All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters
@ 2017-06-29  9:23 Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Previously posted series patches:
v1 - http://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg02044.html
v2 - http://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg05080.html
v3 - http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg00074.html
v4 - http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg03851.html
v5 - http://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg00929.html
v6 - http://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg00947.html

This series helps to optimize the I/O performance of VMDK driver.

Patch 1 helps us to move vmdk_find_offset_in_cluster.

Patch 2 & 3 perform a simple function re-naming tasks.

Patch 4 is used to factor out metadata loading code and implement it in separate
functions. This will help us to avoid code duplication in future patches of this
series.

Patch 5 helps to set the upper limit of the bytes handled in one cycle.

Patch 6 adds new functions to help us allocate multiple clusters according to
the size requested, perform COW if required and return the offset of the first
newly allocated cluster.

Patch 7 changes the metadata update code to update the L2 tables for multiple
clusters at once.

Patch 8 helps us to finally change vmdk_get_cluster_offset() to find cluster offset
only as cluster allocation task is now handled by vmdk_alloc_clusters()

Optimization test results:

This patch series improves 128 KB sequential write performance to an
empty VMDK file by 54%

Benchmark command: ./qemu-img bench -w -c 1024 -s 128K -d 1 -t none -f
vmdk test.vmdk

Changes in v7:
- comment the use of MIN() in calculating skip_end_bytes
- use extent->cluster_sectors instead of 128
- place check for m_data != NULL
- use g_new0(VmdkMetaData, 1) instead of g_malloc0(sizeof(*m_data))

Changes in v6:
- rename total_alloc_clusters as alloc_clusters_counter (fam)

Changes in v5:
- fix commit message and comment in patch 4 (fam)
- add vmdk_ prefix to handle_alloc() (fam)
- fix alignment issue in patch 6 (fam)
- use BDRV_SECTOR_BITS (fam)
- fix endianness calculation in patch 7 (fam)

Changes in v4:
- fix commit message in patch 1 (fam)
- drop size_to_clusters() function (fam)
- fix grammatical errors in function documentations (fam)
- factor out metadata loading coding in a separate patch (patch 4) (fam)
- rename vmdk_alloc_cluster_offset() to vmdk_alloc_clusters() (fam)
- break patch 4(in v3) into separate patches (patch 3 and 8) (fam)
- rename extent_size to extent_end (fam)
- use QEMU_ALIGN_UP instead of vmdk_align_offset. (fam)
- drop next and simply do m_data = m_data->next (fam)

Changes in v3:
- move size_to_clusters() from patch 1 to 3 (fam)
- use DIV_ROUND_UP in size_to_clusters (fam)
- make patch 2 compilable (fam)
- rename vmdk_L2update as vmdk_l2update and use UINT32_MAX (fam)
- combine patch 3 and patch 4 (as in v2) to make them compilable (fam)
- call bdrv_pwrite_sync() for batches of atmost 512 clusters at once (fam)

Changes in v2:
- segregate the ugly Patch 1 in v1 into 6 readable and sensible patches
- include benchmark test results in v2

Ashijeet Acharya (8):
  vmdk: Move vmdk_find_offset_in_cluster() to the top
  vmdk: Rename get_whole_cluster() to vmdk_perform_cow()
  vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
  vmdk: Factor out metadata loading code out of
    vmdk_get_cluster_offset()
  vmdk: Set maximum bytes allocated in one cycle
  vmdk: New functions to assist allocating multiple clusters
  vmdk: Update metadata for multiple clusters
  vmdk: Make vmdk_get_cluster_offset() return cluster offset only

 block/vmdk.c | 537 +++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 415 insertions(+), 122 deletions(-)

-- 
2.6.2

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 2/8] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Move the existing vmdk_find_offset_in_cluster() function to the top of
the driver.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index a9bd22b..22be887 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -242,6 +242,18 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
     s->extents = g_renew(VmdkExtent, s->extents, s->num_extents);
 }
 
+static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
+                                                   int64_t offset)
+{
+    uint64_t extent_begin_offset, extent_relative_offset;
+    uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+
+    extent_begin_offset =
+        (extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
+    extent_relative_offset = offset - extent_begin_offset;
+    return extent_relative_offset % cluster_size;
+}
+
 static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 {
     char *desc;
@@ -1266,18 +1278,6 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
     return NULL;
 }
 
-static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
-                                                   int64_t offset)
-{
-    uint64_t extent_begin_offset, extent_relative_offset;
-    uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
-
-    extent_begin_offset =
-        (extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
-    extent_relative_offset = offset - extent_begin_offset;
-    return extent_relative_offset % cluster_size;
-}
-
 static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
                                                   int64_t sector_num)
 {
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 2/8] vmdk: Rename get_whole_cluster() to vmdk_perform_cow()
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 3/8] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Rename the existing function get_whole_cluster() to vmdk_perform_cow()
as its sole purpose is to perform COW for the first and the last
allocated clusters if needed.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 22be887..73ae786 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1028,8 +1028,8 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 }
 
-/**
- * get_whole_cluster
+/*
+ * vmdk_perform_cow
  *
  * Copy backing file's cluster that covers @sector_num, otherwise write zero,
  * to the cluster at @cluster_sector_num.
@@ -1037,13 +1037,18 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
  * If @skip_start_sector < @skip_end_sector, the relative range
  * [@skip_start_sector, @skip_end_sector) is not copied or written, and leave
  * it for call to write user data in the request.
+ *
+ * Returns:
+ *   VMDK_OK:       on success
+ *
+ *   VMDK_ERROR:    in error cases
  */
-static int get_whole_cluster(BlockDriverState *bs,
-                             VmdkExtent *extent,
-                             uint64_t cluster_offset,
-                             uint64_t offset,
-                             uint64_t skip_start_bytes,
-                             uint64_t skip_end_bytes)
+static int vmdk_perform_cow(BlockDriverState *bs,
+                            VmdkExtent *extent,
+                            uint64_t cluster_offset,
+                            uint64_t offset,
+                            uint64_t skip_start_bytes,
+                            uint64_t skip_end_bytes)
 {
     int ret = VMDK_OK;
     int64_t cluster_bytes;
@@ -1244,7 +1249,7 @@ static int get_cluster_offset(BlockDriverState *bs,
          * This problem may occur because of insufficient space on host disk
          * or inappropriate VM shutdown.
          */
-        ret = get_whole_cluster(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
                                 offset, skip_start_bytes, skip_end_bytes);
         if (ret) {
             return ret;
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 3/8] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset()
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 2/8] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 4/8] vmdk: Factor out metadata loading code out of vmdk_get_cluster_offset() Ashijeet Acharya
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Rename the existing get_cluster_offset() to vmdk_get_cluster_offset()
and update name in all the callers accordingly.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 73ae786..f403981 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1144,7 +1144,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
 }
 
 /**
- * get_cluster_offset
+ * vmdk_get_cluster_offset
  *
  * Look up cluster offset in extent file by sector number, and store in
  * @cluster_offset.
@@ -1163,14 +1163,14 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
  *          VMDK_UNALLOC if cluster is not mapped and @allocate is false.
  *          VMDK_ERROR if failed.
  */
-static int get_cluster_offset(BlockDriverState *bs,
-                              VmdkExtent *extent,
-                              VmdkMetaData *m_data,
-                              uint64_t offset,
-                              bool allocate,
-                              uint64_t *cluster_offset,
-                              uint64_t skip_start_bytes,
-                              uint64_t skip_end_bytes)
+static int vmdk_get_cluster_offset(BlockDriverState *bs,
+                                   VmdkExtent *extent,
+                                   VmdkMetaData *m_data,
+                                   uint64_t offset,
+                                   bool allocate,
+                                   uint64_t *cluster_offset,
+                                   uint64_t skip_start_bytes,
+                                   uint64_t skip_end_bytes)
 {
     unsigned int l1_index, l2_offset, l2_index;
     int min_index, i, j;
@@ -1304,9 +1304,9 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
         return 0;
     }
     qemu_co_mutex_lock(&s->lock);
-    ret = get_cluster_offset(bs, extent, NULL,
-                             sector_num * 512, false, &offset,
-                             0, 0);
+    ret = vmdk_get_cluster_offset(bs, extent, NULL,
+                                  sector_num * 512, false, &offset,
+                                  0, 0);
     qemu_co_mutex_unlock(&s->lock);
 
     index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
@@ -1497,8 +1497,8 @@ vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
             ret = -EIO;
             goto fail;
         }
-        ret = get_cluster_offset(bs, extent, NULL,
-                                 offset, false, &cluster_offset, 0, 0);
+        ret = vmdk_get_cluster_offset(bs, extent, NULL,
+                                      offset, false, &cluster_offset, 0, 0);
         offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
 
         n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
@@ -1584,10 +1584,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
                              - offset_in_cluster);
 
-        ret = get_cluster_offset(bs, extent, &m_data, offset,
-                                 !(extent->compressed || zeroed),
-                                 &cluster_offset, offset_in_cluster,
-                                 offset_in_cluster + n_bytes);
+        ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
+                                      !(extent->compressed || zeroed),
+                                      &cluster_offset, offset_in_cluster,
+                                      offset_in_cluster + n_bytes);
         if (extent->compressed) {
             if (ret == VMDK_OK) {
                 /* Refuse write to allocated cluster for streamOptimized */
@@ -1596,8 +1596,8 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
                 return -EIO;
             } else {
                 /* allocate */
-                ret = get_cluster_offset(bs, extent, &m_data, offset,
-                                         true, &cluster_offset, 0, 0);
+                ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
+                                              true, &cluster_offset, 0, 0);
             }
         }
         if (ret == VMDK_ERROR) {
@@ -2229,9 +2229,9 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
                     sector_num);
             break;
         }
-        ret = get_cluster_offset(bs, extent, NULL,
-                                 sector_num << BDRV_SECTOR_BITS,
-                                 false, &cluster_offset, 0, 0);
+        ret = vmdk_get_cluster_offset(bs, extent, NULL,
+                                      sector_num << BDRV_SECTOR_BITS,
+                                      false, &cluster_offset, 0, 0);
         if (ret == VMDK_ERROR) {
             fprintf(stderr,
                     "ERROR: could not get cluster_offset for sector %"
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 4/8] vmdk: Factor out metadata loading code out of vmdk_get_cluster_offset()
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (2 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 3/8] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 5/8] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Move the cluster tables loading code out of the existing
vmdk_get_cluster_offset() function and implement it in separate
get_cluster_table() and vmdk_l2load() functions.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 153 ++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 105 insertions(+), 48 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f403981..5647f53 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1143,6 +1143,105 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
     return VMDK_OK;
 }
 
+/*
+ * vmdk_l2load
+ *
+ * Load a new L2 table into memory. If the table is in the cache, the cache
+ * is used; otherwise the L2 table is loaded from the image file.
+ *
+ * Returns:
+ *   VMDK_OK:       on success
+ *   VMDK_ERROR:    in error cases
+ */
+static int vmdk_l2load(VmdkExtent *extent, uint64_t offset, int l2_offset,
+                       uint32_t **new_l2_table, int *new_l2_index)
+{
+    int min_index, i, j;
+    uint32_t *l2_table;
+    uint32_t min_count;
+
+    for (i = 0; i < L2_CACHE_SIZE; i++) {
+        if (l2_offset == extent->l2_cache_offsets[i]) {
+            /* increment the hit count */
+            if (++extent->l2_cache_counts[i] == UINT32_MAX) {
+                for (j = 0; j < L2_CACHE_SIZE; j++) {
+                    extent->l2_cache_counts[j] >>= 1;
+                }
+            }
+            l2_table = extent->l2_cache + (i * extent->l2_size);
+            goto found;
+        }
+    }
+    /* not found: load a new entry in the least used one */
+    min_index = 0;
+    min_count = UINT32_MAX;
+    for (i = 0; i < L2_CACHE_SIZE; i++) {
+        if (extent->l2_cache_counts[i] < min_count) {
+            min_count = extent->l2_cache_counts[i];
+            min_index = i;
+        }
+    }
+    l2_table = extent->l2_cache + (min_index * extent->l2_size);
+    if (bdrv_pread(extent->file,
+                (int64_t)l2_offset * 512,
+                l2_table,
+                extent->l2_size * sizeof(uint32_t)
+            ) != extent->l2_size * sizeof(uint32_t)) {
+        return VMDK_ERROR;
+    }
+
+    extent->l2_cache_offsets[min_index] = l2_offset;
+    extent->l2_cache_counts[min_index] = 1;
+found:
+    *new_l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
+    *new_l2_table = l2_table;
+
+    return VMDK_OK;
+}
+
+/*
+ * get_cluster_table
+ *
+ * For a given offset, load (and allocate if needed) the l2 table.
+ *
+ * Returns:
+ *   VMDK_OK:        on success
+ *
+ *   VMDK_UNALLOC:   if cluster is not mapped
+ *
+ *   VMDK_ERROR:     in error cases
+ */
+static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
+                             int *new_l1_index, int *new_l2_offset,
+                             int *new_l2_index, uint32_t **new_l2_table)
+{
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
+    int ret;
+
+    offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
+    l1_index = (offset >> 9) / extent->l1_entry_sectors;
+    if (l1_index >= extent->l1_size) {
+        return VMDK_ERROR;
+    }
+    l2_offset = extent->l1_table[l1_index];
+    if (!l2_offset) {
+        return VMDK_UNALLOC;
+    }
+
+    ret = vmdk_l2load(extent, offset, l2_offset, &l2_table, &l2_index);
+    if (ret < 0) {
+        return ret;
+    }
+
+    *new_l1_index = l1_index;
+    *new_l2_offset = l2_offset;
+    *new_l2_index = l2_index;
+    *new_l2_table = l2_table;
+
+    return VMDK_OK;
+}
+
 /**
  * vmdk_get_cluster_offset
  *
@@ -1172,66 +1271,24 @@ static int vmdk_get_cluster_offset(BlockDriverState *bs,
                                    uint64_t skip_start_bytes,
                                    uint64_t skip_end_bytes)
 {
-    unsigned int l1_index, l2_offset, l2_index;
-    int min_index, i, j;
-    uint32_t min_count, *l2_table;
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
     bool zeroed = false;
     int64_t ret;
     int64_t cluster_sector;
 
-    if (m_data) {
-        m_data->valid = 0;
-    }
     if (extent->flat) {
         *cluster_offset = extent->flat_start_offset;
         return VMDK_OK;
     }
 
-    offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
-    l1_index = (offset >> 9) / extent->l1_entry_sectors;
-    if (l1_index >= extent->l1_size) {
-        return VMDK_ERROR;
-    }
-    l2_offset = extent->l1_table[l1_index];
-    if (!l2_offset) {
-        return VMDK_UNALLOC;
-    }
-    for (i = 0; i < L2_CACHE_SIZE; i++) {
-        if (l2_offset == extent->l2_cache_offsets[i]) {
-            /* increment the hit count */
-            if (++extent->l2_cache_counts[i] == 0xffffffff) {
-                for (j = 0; j < L2_CACHE_SIZE; j++) {
-                    extent->l2_cache_counts[j] >>= 1;
-                }
-            }
-            l2_table = extent->l2_cache + (i * extent->l2_size);
-            goto found;
-        }
-    }
-    /* not found: load a new entry in the least used one */
-    min_index = 0;
-    min_count = 0xffffffff;
-    for (i = 0; i < L2_CACHE_SIZE; i++) {
-        if (extent->l2_cache_counts[i] < min_count) {
-            min_count = extent->l2_cache_counts[i];
-            min_index = i;
-        }
-    }
-    l2_table = extent->l2_cache + (min_index * extent->l2_size);
-    if (bdrv_pread(extent->file,
-                (int64_t)l2_offset * 512,
-                l2_table,
-                extent->l2_size * sizeof(uint32_t)
-            ) != extent->l2_size * sizeof(uint32_t)) {
-        return VMDK_ERROR;
+    ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+                            &l2_index, &l2_table);
+    if (ret < 0) {
+        return ret;
     }
 
-    extent->l2_cache_offsets[min_index] = l2_offset;
-    extent->l2_cache_counts[min_index] = 1;
- found:
-    l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
     cluster_sector = le32_to_cpu(l2_table[l2_index]);
-
     if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
         zeroed = true;
     }
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 5/8] vmdk: Set maximum bytes allocated in one cycle
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (3 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 4/8] vmdk: Factor out metadata loading code out of vmdk_get_cluster_offset() Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Set the maximum bytes allowed to get allocated at once to be not more
than the extent size boundary to handle writes at two separate extents
appropriately.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 5647f53..fe2046b 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1624,6 +1624,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     uint64_t cluster_offset;
     uint64_t bytes_done = 0;
     VmdkMetaData m_data;
+    uint64_t extent_end;
 
     if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
         error_report("Wrong offset: offset=0x%" PRIx64
@@ -1637,9 +1638,17 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         if (!extent) {
             return -EIO;
         }
+        extent_end = extent->end_sector * BDRV_SECTOR_SIZE;
+
         offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
-        n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
-                             - offset_in_cluster);
+
+        /* truncate n_bytes to first cluster because we need to perform COW */
+        if (offset_in_cluster > 0) {
+            n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
+                                 - offset_in_cluster);
+        } else {
+            n_bytes = MIN(bytes, extent_end - offset);
+        }
 
         ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
                                       !(extent->compressed || zeroed),
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (4 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 5/8] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-07-27 12:48   ` Fam Zheng
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 7/8] vmdk: Update metadata for " Ashijeet Acharya
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Introduce two new helper functions handle_alloc() and
vmdk_alloc_cluster_offset(). handle_alloc() helps to allocate multiple
clusters at once starting from a given offset on disk and performs COW
if necessary for first and last allocated clusters.
vmdk_alloc_cluster_offset() helps to return the offset of the first of
the many newly allocated clusters. Also, provide proper documentation
for both.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/vmdk.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 190 insertions(+), 10 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index fe2046b..277db16 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
     unsigned int l2_offset;
     int valid;
     uint32_t *l2_cache_entry;
+    uint32_t nb_clusters;
 } VmdkMetaData;
 
 typedef struct VmdkGrainMarker {
@@ -1242,6 +1243,182 @@ static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
     return VMDK_OK;
 }
 
+/*
+ * vmdk_handle_alloc
+ *
+ * Allocate new clusters for an area that either is yet unallocated or needs a
+ * copy on write.
+ *
+ * Returns:
+ *   VMDK_OK:       if new clusters were allocated, *bytes may be decreased if
+ *                  the new allocation doesn't cover all of the requested area.
+ *                  *cluster_offset is updated to contain the offset of the
+ *                  first newly allocated cluster.
+ *
+ *   VMDK_UNALLOC:  if no clusters could be allocated. *cluster_offset is left
+ *                  unchanged.
+ *
+ *   VMDK_ERROR:    in error cases
+ */
+static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
+                             uint64_t offset, uint64_t *cluster_offset,
+                             int64_t *bytes, VmdkMetaData *m_data,
+                             bool allocate, uint32_t *total_alloc_clusters)
+{
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
+    uint32_t cluster_sector;
+    uint32_t nb_clusters;
+    bool zeroed = false;
+    uint64_t skip_start_bytes, skip_end_bytes;
+    int ret;
+
+    ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+                            &l2_index, &l2_table);
+    if (ret < 0) {
+        return ret;
+    }
+
+    cluster_sector = le32_to_cpu(l2_table[l2_index]);
+
+    skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
+    /* Calculate the number of clusters to look for. Here we truncate the last
+     * cluster, i.e. 1 less than the actual value calculated as we may need to
+     * perform COW for the last one. */
+    nb_clusters = DIV_ROUND_UP(skip_start_bytes + *bytes,
+                               extent->cluster_sectors << BDRV_SECTOR_BITS) - 1;
+
+    nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
+    assert(nb_clusters <= INT_MAX);
+
+    /* update bytes according to final nb_clusters value */
+    if (nb_clusters != 0) {
+        *bytes = ((nb_clusters * extent->cluster_sectors) << BDRV_SECTOR_BITS)
+                 - skip_start_bytes;
+    } else {
+        nb_clusters = 1;
+    }
+    *total_alloc_clusters += nb_clusters;
+
+    /* we need to use MIN() for basically 3 cases that arise :
+     * 1. alloc very first cluster : here skip_start_bytes >= 0 and
+     *    *bytes <= cluster_size.
+     * 2. alloc middle clusters : here *bytes is a perfect multiple of
+     *    cluster_size and skip_start_bytes is 0.
+     * 3. alloc very last cluster : here *bytes <= cluster_size and
+     *    skip_start_bytes is 0
+     */
+    skip_end_bytes = skip_start_bytes + MIN(*bytes,
+                     extent->cluster_sectors * BDRV_SECTOR_SIZE
+                                    - skip_start_bytes);
+
+    if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
+        zeroed = true;
+    }
+
+    if (!cluster_sector || zeroed) {
+        if (!allocate) {
+            return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
+        }
+
+        cluster_sector = extent->next_cluster_sector;
+        extent->next_cluster_sector += extent->cluster_sectors
+                                                * nb_clusters;
+
+        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+                               offset, skip_start_bytes,
+                               skip_end_bytes);
+        if (ret < 0) {
+            return ret;
+        }
+        if (m_data) {
+            m_data->valid = 1;
+            m_data->l1_index = l1_index;
+            m_data->l2_index = l2_index;
+            m_data->l2_offset = l2_offset;
+            m_data->l2_cache_entry = &l2_table[l2_index];
+            m_data->nb_clusters = nb_clusters;
+        }
+    }
+    *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
+    return VMDK_OK;
+}
+
+/*
+ * vmdk_alloc_clusters
+ *
+ * For a given offset on the virtual disk, find the cluster offset in vmdk
+ * file. If the offset is not found, allocate a new cluster.
+ *
+ * If the cluster is newly allocated, m_data->nb_clusters is set to the number
+ * of contiguous clusters that have been allocated. In this case, the other
+ * fields of m_data are valid and contain information about the first allocated
+ * cluster.
+ *
+ * Returns:
+ *
+ *   VMDK_OK:           on success and @cluster_offset was set
+ *
+ *   VMDK_UNALLOC:      if no clusters were allocated and @cluster_offset is
+ *                      set to zero
+ *
+ *   VMDK_ERROR:        in error cases
+ */
+static int vmdk_alloc_clusters(BlockDriverState *bs,
+                               VmdkExtent *extent,
+                               VmdkMetaData *m_data, uint64_t offset,
+                               bool allocate, uint64_t *cluster_offset,
+                               int64_t bytes,
+                               uint32_t *total_alloc_clusters)
+{
+    uint64_t start, remaining;
+    uint64_t new_cluster_offset;
+    int64_t n_bytes;
+    int ret;
+
+    if (extent->flat) {
+        *cluster_offset = extent->flat_start_offset;
+        return VMDK_OK;
+    }
+
+    start = offset;
+    remaining = bytes;
+    new_cluster_offset = 0;
+    *cluster_offset = 0;
+    n_bytes = 0;
+    if (m_data) {
+        m_data->valid = 0;
+    }
+
+    /* due to L2 table margins all bytes may not get allocated at once */
+    while (true) {
+
+        if (!*cluster_offset) {
+            *cluster_offset = new_cluster_offset;
+        }
+
+        start              += n_bytes;
+        remaining          -= n_bytes;
+        new_cluster_offset += n_bytes;
+
+        if (remaining == 0) {
+            break;
+        }
+
+        n_bytes = remaining;
+
+        ret = vmdk_handle_alloc(bs, extent, start, &new_cluster_offset, &n_bytes,
+                                m_data, allocate, total_alloc_clusters);
+
+        if (ret < 0) {
+            return ret;
+
+        }
+    }
+
+    return VMDK_OK;
+}
+
 /**
  * vmdk_get_cluster_offset
  *
@@ -1625,6 +1802,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     uint64_t bytes_done = 0;
     VmdkMetaData m_data;
     uint64_t extent_end;
+    uint32_t total_alloc_clusters = 0;
 
     if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
         error_report("Wrong offset: offset=0x%" PRIx64
@@ -1650,10 +1828,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
             n_bytes = MIN(bytes, extent_end - offset);
         }
 
-        ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
-                                      !(extent->compressed || zeroed),
-                                      &cluster_offset, offset_in_cluster,
-                                      offset_in_cluster + n_bytes);
+        ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
+                                  !(extent->compressed || zeroed),
+                                  &cluster_offset, n_bytes,
+                                  &total_alloc_clusters);
         if (extent->compressed) {
             if (ret == VMDK_OK) {
                 /* Refuse write to allocated cluster for streamOptimized */
@@ -1662,8 +1840,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
                 return -EIO;
             } else {
                 /* allocate */
-                ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
-                                              true, &cluster_offset, 0, 0);
+                ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
+                                          true, &cluster_offset, n_bytes,
+                                          &total_alloc_clusters);
             }
         }
         if (ret == VMDK_ERROR) {
@@ -1671,10 +1850,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         }
         if (zeroed) {
             /* Do zeroed write, buf is ignored */
-            if (extent->has_zero_grain &&
-                    offset_in_cluster == 0 &&
-                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
-                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+            if (extent->has_zero_grain && offset_in_cluster == 0 &&
+                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                        total_alloc_clusters) {
+                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                                        total_alloc_clusters;
                 if (!zero_dry_run) {
                     /* update L2 tables */
                     if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 7/8] vmdk: Update metadata for multiple clusters
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (5 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 8/8] vmdk: Make vmdk_get_cluster_offset() return cluster offset only Ashijeet Acharya
  2017-07-27 12:49 ` [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Fam Zheng
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

Include a next pointer in VmdkMetaData struct to point to the previous
allocated L2 table. Modify vmdk_L2update to start updating metadata for
allocation of multiple clusters at once.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
---
 block/vmdk.c | 128 ++++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 101 insertions(+), 27 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 277db16..60b8adc 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -137,6 +137,8 @@ typedef struct VmdkMetaData {
     int valid;
     uint32_t *l2_cache_entry;
     uint32_t nb_clusters;
+    uint32_t offset;
+    struct VmdkMetaData *next;
 } VmdkMetaData;
 
 typedef struct VmdkGrainMarker {
@@ -1116,34 +1118,87 @@ exit:
     return ret;
 }
 
-static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
-                         uint32_t offset)
+static int vmdk_alloc_cluster_link_l2(VmdkExtent *extent,
+                                      VmdkMetaData *m_data, bool zeroed)
 {
-    offset = cpu_to_le32(offset);
+    int i;
+    uint32_t offset, temp_offset;
+    int *l2_table_array;
+    int l2_array_size;
+
+    if (zeroed) {
+        temp_offset = VMDK_GTE_ZEROED;
+    } else {
+        temp_offset = m_data->offset;
+    }
+
+    l2_array_size = sizeof(uint32_t) * m_data->nb_clusters;
+    l2_table_array = qemu_try_blockalign(extent->file->bs,
+                                         QEMU_ALIGN_UP(l2_array_size,
+                                                       BDRV_SECTOR_SIZE));
+    if (l2_table_array == NULL) {
+        return VMDK_ERROR;
+    }
+    memset(l2_table_array, 0, QEMU_ALIGN_UP(l2_array_size, BDRV_SECTOR_SIZE));
     /* update L2 table */
+    offset = temp_offset;
+    for (i = 0; i < m_data->nb_clusters; i++) {
+        l2_table_array[i] = cpu_to_le32(offset);
+        if (!zeroed) {
+            offset += extent->cluster_sectors;
+        }
+    }
     if (bdrv_pwrite_sync(extent->file,
-                ((int64_t)m_data->l2_offset * 512)
-                    + (m_data->l2_index * sizeof(offset)),
-                &offset, sizeof(offset)) < 0) {
+                         ((int64_t)m_data->l2_offset * 512)
+                         + ((m_data->l2_index) * sizeof(offset)),
+                         l2_table_array, l2_array_size) < 0) {
         return VMDK_ERROR;
     }
     /* update backup L2 table */
     if (extent->l1_backup_table_offset != 0) {
         m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
         if (bdrv_pwrite_sync(extent->file,
-                    ((int64_t)m_data->l2_offset * 512)
-                        + (m_data->l2_index * sizeof(offset)),
-                    &offset, sizeof(offset)) < 0) {
+                             ((int64_t)m_data->l2_offset * 512)
+                             + ((m_data->l2_index) * sizeof(offset)),
+                             l2_table_array, l2_array_size) < 0) {
             return VMDK_ERROR;
         }
     }
+
+    offset = temp_offset;
     if (m_data->l2_cache_entry) {
-        *m_data->l2_cache_entry = offset;
+        for (i = 0; i < m_data->nb_clusters; i++) {
+            *m_data->l2_cache_entry = cpu_to_le32(offset);
+            m_data->l2_cache_entry++;
+
+            if (!zeroed) {
+                offset += extent->cluster_sectors;
+            }
+        }
     }
 
+    qemu_vfree(l2_table_array);
     return VMDK_OK;
 }
 
+static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
+                         bool zeroed)
+{
+    int ret;
+
+    while (m_data->next != NULL) {
+
+        ret = vmdk_alloc_cluster_link_l2(extent, m_data, zeroed);
+        if (ret < 0) {
+            return ret;
+        }
+
+        m_data = m_data->next;
+     }
+
+     return VMDK_OK;
+}
+
 /*
  * vmdk_l2load
  *
@@ -1260,9 +1315,10 @@ static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
  *
  *   VMDK_ERROR:    in error cases
  */
+
 static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
                              uint64_t offset, uint64_t *cluster_offset,
-                             int64_t *bytes, VmdkMetaData *m_data,
+                             int64_t *bytes, VmdkMetaData **m_data,
                              bool allocate, uint32_t *total_alloc_clusters)
 {
     int l1_index, l2_offset, l2_index;
@@ -1271,6 +1327,7 @@ static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
     uint32_t nb_clusters;
     bool zeroed = false;
     uint64_t skip_start_bytes, skip_end_bytes;
+    VmdkMetaData *old_m_data;
     int ret;
 
     ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
@@ -1331,13 +1388,21 @@ static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
         if (ret < 0) {
             return ret;
         }
-        if (m_data) {
-            m_data->valid = 1;
-            m_data->l1_index = l1_index;
-            m_data->l2_index = l2_index;
-            m_data->l2_offset = l2_offset;
-            m_data->l2_cache_entry = &l2_table[l2_index];
-            m_data->nb_clusters = nb_clusters;
+
+        if (m_data != NULL) {
+            old_m_data = *m_data;
+            *m_data = g_malloc0(sizeof(**m_data));
+
+            **m_data = (VmdkMetaData) {
+                .valid            =    1,
+                .l1_index         =    l1_index,
+                .l2_index         =    l2_index,
+                .l2_offset        =    l2_offset,
+                .l2_cache_entry   =    &l2_table[l2_index],
+                .nb_clusters      =    nb_clusters,
+                .offset           =    cluster_sector,
+                .next             =    old_m_data,
+            };
         }
     }
     *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
@@ -1366,7 +1431,7 @@ static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
  */
 static int vmdk_alloc_clusters(BlockDriverState *bs,
                                VmdkExtent *extent,
-                               VmdkMetaData *m_data, uint64_t offset,
+                               VmdkMetaData **m_data, uint64_t offset,
                                bool allocate, uint64_t *cluster_offset,
                                int64_t bytes,
                                uint32_t *total_alloc_clusters)
@@ -1386,8 +1451,8 @@ static int vmdk_alloc_clusters(BlockDriverState *bs,
     new_cluster_offset = 0;
     *cluster_offset = 0;
     n_bytes = 0;
-    if (m_data) {
-        m_data->valid = 0;
+    if (*m_data) {
+        (*m_data)->valid = 0;
     }
 
     /* due to L2 table margins all bytes may not get allocated at once */
@@ -1800,10 +1865,12 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     int64_t offset_in_cluster, n_bytes;
     uint64_t cluster_offset;
     uint64_t bytes_done = 0;
-    VmdkMetaData m_data;
     uint64_t extent_end;
+    VmdkMetaData *m_data;
     uint32_t total_alloc_clusters = 0;
 
+    m_data = g_new0(VmdkMetaData, 1);
+
     if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
         error_report("Wrong offset: offset=0x%" PRIx64
                      " total_sectors=0x%" PRIx64,
@@ -1812,6 +1879,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     }
 
     while (bytes > 0) {
+        m_data->next = NULL;
         extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
         if (!extent) {
             return -EIO;
@@ -1857,7 +1925,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
                                         total_alloc_clusters;
                 if (!zero_dry_run) {
                     /* update L2 tables */
-                    if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
+                    if (vmdk_L2update(extent, m_data, zeroed)
                             != VMDK_OK) {
                         return -EIO;
                     }
@@ -1871,11 +1939,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
             if (ret) {
                 return ret;
             }
-            if (m_data.valid) {
+            if (m_data->valid) {
                 /* update L2 tables */
-                if (vmdk_L2update(extent, &m_data,
-                                  cluster_offset >> BDRV_SECTOR_BITS)
-                        != VMDK_OK) {
+                if (vmdk_L2update(extent, m_data, zeroed) != VMDK_OK) {
                     return -EIO;
                 }
             }
@@ -1884,6 +1950,13 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         offset += n_bytes;
         bytes_done += n_bytes;
 
+        while (m_data->next != NULL) {
+            VmdkMetaData *next;
+            next = m_data->next;
+            g_free(m_data);
+            m_data = next;
+        }
+
         /* update CID on the first write every time the virtual disk is
          * opened */
         if (!s->cid_updated) {
@@ -1894,6 +1967,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
             s->cid_updated = true;
         }
     }
+    g_free(m_data);
     return 0;
 }
 
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v7 8/8] vmdk: Make vmdk_get_cluster_offset() return cluster offset only
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (6 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 7/8] vmdk: Update metadata for " Ashijeet Acharya
@ 2017-06-29  9:23 ` Ashijeet Acharya
  2017-07-27 12:49 ` [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Fam Zheng
  8 siblings, 0 replies; 11+ messages in thread
From: Ashijeet Acharya @ 2017-06-29  9:23 UTC (permalink / raw)
  To: famz
  Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block, Ashijeet Acharya

vmdk_alloc_clusters() introduced earlier now handles the task of
allocating clusters and performing COW when needed. Thus we can change
vmdk_get_cluster_offset() to stick to the sole purpose of returning
cluster offset using sector number. Update the changes at all call
sites.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 56 ++++++++++++--------------------------------------------
 1 file changed, 12 insertions(+), 44 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 60b8adc..d41fde9 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1493,25 +1493,16 @@ static int vmdk_alloc_clusters(BlockDriverState *bs,
  * For flat extents, the start offset as parsed from the description file is
  * returned.
  *
- * For sparse extents, look up in L1, L2 table. If allocate is true, return an
- * offset for a new cluster and update L2 cache. If there is a backing file,
- * COW is done before returning; otherwise, zeroes are written to the allocated
- * cluster. Both COW and zero writing skips the sector range
- * [@skip_start_sector, @skip_end_sector) passed in by caller, because caller
- * has new data to write there.
+ * For sparse extents, look up the L1, L2 table.
  *
  * Returns: VMDK_OK if cluster exists and mapped in the image.
- *          VMDK_UNALLOC if cluster is not mapped and @allocate is false.
- *          VMDK_ERROR if failed.
+ *          VMDK_UNALLOC if cluster is not mapped.
+ *          VMDK_ERROR if failed
  */
 static int vmdk_get_cluster_offset(BlockDriverState *bs,
                                    VmdkExtent *extent,
-                                   VmdkMetaData *m_data,
                                    uint64_t offset,
-                                   bool allocate,
-                                   uint64_t *cluster_offset,
-                                   uint64_t skip_start_bytes,
-                                   uint64_t skip_end_bytes)
+                                   uint64_t *cluster_offset)
 {
     int l1_index, l2_offset, l2_index;
     uint32_t *l2_table;
@@ -1536,31 +1527,9 @@ static int vmdk_get_cluster_offset(BlockDriverState *bs,
     }
 
     if (!cluster_sector || zeroed) {
-        if (!allocate) {
-            return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
-        }
-
-        cluster_sector = extent->next_cluster_sector;
-        extent->next_cluster_sector += extent->cluster_sectors;
-
-        /* First of all we write grain itself, to avoid race condition
-         * that may to corrupt the image.
-         * This problem may occur because of insufficient space on host disk
-         * or inappropriate VM shutdown.
-         */
-        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
-                                offset, skip_start_bytes, skip_end_bytes);
-        if (ret) {
-            return ret;
-        }
-        if (m_data) {
-            m_data->valid = 1;
-            m_data->l1_index = l1_index;
-            m_data->l2_index = l2_index;
-            m_data->l2_offset = l2_offset;
-            m_data->l2_cache_entry = &l2_table[l2_index];
-        }
+        return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
     }
+
     *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
     return VMDK_OK;
 }
@@ -1603,9 +1572,7 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
         return 0;
     }
     qemu_co_mutex_lock(&s->lock);
-    ret = vmdk_get_cluster_offset(bs, extent, NULL,
-                                  sector_num * 512, false, &offset,
-                                  0, 0);
+    ret = vmdk_get_cluster_offset(bs, extent, sector_num * 512, &offset);
     qemu_co_mutex_unlock(&s->lock);
 
     index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
@@ -1796,13 +1763,14 @@ vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
             ret = -EIO;
             goto fail;
         }
-        ret = vmdk_get_cluster_offset(bs, extent, NULL,
-                                      offset, false, &cluster_offset, 0, 0);
+
         offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
 
         n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
                              - offset_in_cluster);
 
+        ret = vmdk_get_cluster_offset(bs, extent, offset, &cluster_offset);
+
         if (ret != VMDK_OK) {
             /* if not allocated, try to read from parent image, if exist */
             if (bs->backing && ret != VMDK_ZEROED) {
@@ -2549,9 +2517,9 @@ static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
                     sector_num);
             break;
         }
-        ret = vmdk_get_cluster_offset(bs, extent, NULL,
+        ret = vmdk_get_cluster_offset(bs, extent,
                                       sector_num << BDRV_SECTOR_BITS,
-                                      false, &cluster_offset, 0, 0);
+                                      &cluster_offset);
         if (ret == VMDK_ERROR) {
             fprintf(stderr,
                     "ERROR: could not get cluster_offset for sector %"
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
@ 2017-07-27 12:48   ` Fam Zheng
  0 siblings, 0 replies; 11+ messages in thread
From: Fam Zheng @ 2017-07-27 12:48 UTC (permalink / raw)
  To: Ashijeet Acharya; +Cc: kwolf, jsnow, mreitz, stefanha, qemu-devel, qemu-block

On Thu, 06/29 14:53, Ashijeet Acharya wrote:
> +/*
> + * vmdk_handle_alloc
> + *
> + * Allocate new clusters for an area that either is yet unallocated or needs a
> + * copy on write.
> + *
> + * Returns:
> + *   VMDK_OK:       if new clusters were allocated, *bytes may be decreased if
> + *                  the new allocation doesn't cover all of the requested area.
> + *                  *cluster_offset is updated to contain the offset of the
> + *                  first newly allocated cluster.
> + *
> + *   VMDK_UNALLOC:  if no clusters could be allocated. *cluster_offset is left
> + *                  unchanged.
> + *
> + *   VMDK_ERROR:    in error cases
> + */
> +static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
> +                             uint64_t offset, uint64_t *cluster_offset,
> +                             int64_t *bytes, VmdkMetaData *m_data,
> +                             bool allocate, uint32_t *total_alloc_clusters)

This was renamed "alloc_clusters_counter" in v6, looks like somehow you reverted
the change to v5.

> +{
> +    int l1_index, l2_offset, l2_index;
> +    uint32_t *l2_table;
> +    uint32_t cluster_sector;
> +    uint32_t nb_clusters;
> +    bool zeroed = false;
> +    uint64_t skip_start_bytes, skip_end_bytes;
> +    int ret;

Fam

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters
  2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
                   ` (7 preceding siblings ...)
  2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 8/8] vmdk: Make vmdk_get_cluster_offset() return cluster offset only Ashijeet Acharya
@ 2017-07-27 12:49 ` Fam Zheng
  8 siblings, 0 replies; 11+ messages in thread
From: Fam Zheng @ 2017-07-27 12:49 UTC (permalink / raw)
  To: Ashijeet Acharya; +Cc: kwolf, qemu-block, stefanha, qemu-devel, mreitz, jsnow

On Thu, 06/29 14:53, Ashijeet Acharya wrote:
> Changes in v7:
> - comment the use of MIN() in calculating skip_end_bytes
> - use extent->cluster_sectors instead of 128
> - place check for m_data != NULL
> - use g_new0(VmdkMetaData, 1) instead of g_malloc0(sizeof(*m_data))

There is a minor regression as I replied inline, otherwise:

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-07-27 12:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-29  9:23 [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 1/8] vmdk: Move vmdk_find_offset_in_cluster() to the top Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 2/8] vmdk: Rename get_whole_cluster() to vmdk_perform_cow() Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 3/8] vmdk: Rename get_cluster_offset() to vmdk_get_cluster_offset() Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 4/8] vmdk: Factor out metadata loading code out of vmdk_get_cluster_offset() Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 5/8] vmdk: Set maximum bytes allocated in one cycle Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 6/8] vmdk: New functions to assist allocating multiple clusters Ashijeet Acharya
2017-07-27 12:48   ` Fam Zheng
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 7/8] vmdk: Update metadata for " Ashijeet Acharya
2017-06-29  9:23 ` [Qemu-devel] [PATCH v7 8/8] vmdk: Make vmdk_get_cluster_offset() return cluster offset only Ashijeet Acharya
2017-07-27 12:49 ` [Qemu-devel] [PATCH v7 0/8] Optimize VMDK I/O by allocating multiple clusters Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.