All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters
@ 2013-06-13 11:47 Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This series adds options to make qcow2 discard freed clusters, in several
categories. By default, only freed clusters related to snapshots (i.e. mainly
snapshot deletion) are discarded.

Kevin Wolf (5):
  Revert "block: Disable driver-specific options for 1.5"
  qcow2: Add refcount update reason to all callers
  qcow2: Options to enable discard for freed clusters
  qcow2: Batch discards
  block: Always enable discard on the protocol level

 block.c                  |   2 +-
 block/qcow2-cluster.c    |  38 +++++++++----
 block/qcow2-refcount.c   | 139 ++++++++++++++++++++++++++++++++++++++++-------
 block/qcow2-snapshot.c   |   6 +-
 block/qcow2.c            |  30 +++++++++-
 block/qcow2.h            |  32 ++++++++++-
 blockdev.c               | 118 ++--------------------------------------
 tests/qemu-iotests/group |   2 +-
 8 files changed, 214 insertions(+), 153 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 1/5] Revert "block: Disable driver-specific options for 1.5"
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
@ 2013-06-13 11:47 ` Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This reverts commit 8ec7d390b0d50b5e5b4b1d8dba7ba40d64a70875.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c               | 118 ++---------------------------------------------
 tests/qemu-iotests/group |   2 +-
 2 files changed, 5 insertions(+), 115 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 9937311..4cb592f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1734,120 +1734,10 @@ QemuOptsList qemu_drive_opts = {
     .name = "drive",
     .head = QTAILQ_HEAD_INITIALIZER(qemu_drive_opts.head),
     .desc = {
-        {
-            .name = "bus",
-            .type = QEMU_OPT_NUMBER,
-            .help = "bus number",
-        },{
-            .name = "unit",
-            .type = QEMU_OPT_NUMBER,
-            .help = "unit number (i.e. lun for scsi)",
-        },{
-            .name = "if",
-            .type = QEMU_OPT_STRING,
-            .help = "interface (ide, scsi, sd, mtd, floppy, pflash, virtio)",
-        },{
-            .name = "index",
-            .type = QEMU_OPT_NUMBER,
-            .help = "index number",
-        },{
-            .name = "cyls",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of cylinders (ide disk geometry)",
-        },{
-            .name = "heads",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of heads (ide disk geometry)",
-        },{
-            .name = "secs",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of sectors (ide disk geometry)",
-        },{
-            .name = "trans",
-            .type = QEMU_OPT_STRING,
-            .help = "chs translation (auto, lba. none)",
-        },{
-            .name = "media",
-            .type = QEMU_OPT_STRING,
-            .help = "media type (disk, cdrom)",
-        },{
-            .name = "snapshot",
-            .type = QEMU_OPT_BOOL,
-            .help = "enable/disable snapshot mode",
-        },{
-            .name = "file",
-            .type = QEMU_OPT_STRING,
-            .help = "disk image",
-        },{
-            .name = "discard",
-            .type = QEMU_OPT_STRING,
-            .help = "discard operation (ignore/off, unmap/on)",
-        },{
-            .name = "cache",
-            .type = QEMU_OPT_STRING,
-            .help = "host cache usage (none, writeback, writethrough, "
-                    "directsync, unsafe)",
-        },{
-            .name = "aio",
-            .type = QEMU_OPT_STRING,
-            .help = "host AIO implementation (threads, native)",
-        },{
-            .name = "format",
-            .type = QEMU_OPT_STRING,
-            .help = "disk format (raw, qcow2, ...)",
-        },{
-            .name = "serial",
-            .type = QEMU_OPT_STRING,
-            .help = "disk serial number",
-        },{
-            .name = "rerror",
-            .type = QEMU_OPT_STRING,
-            .help = "read error action",
-        },{
-            .name = "werror",
-            .type = QEMU_OPT_STRING,
-            .help = "write error action",
-        },{
-            .name = "addr",
-            .type = QEMU_OPT_STRING,
-            .help = "pci address (virtio only)",
-        },{
-            .name = "readonly",
-            .type = QEMU_OPT_BOOL,
-            .help = "open drive file as read-only",
-        },{
-            .name = "iops",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit total I/O operations per second",
-        },{
-            .name = "iops_rd",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit read operations per second",
-        },{
-            .name = "iops_wr",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit write operations per second",
-        },{
-            .name = "bps",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit total bytes per second",
-        },{
-            .name = "bps_rd",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit read bytes per second",
-        },{
-            .name = "bps_wr",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit write bytes per second",
-        },{
-            .name = "copy-on-read",
-            .type = QEMU_OPT_BOOL,
-            .help = "copy read data from backing file into image file",
-        },{
-            .name = "boot",
-            .type = QEMU_OPT_BOOL,
-            .help = "(deprecated, ignored)",
-        },
+        /*
+         * no elements => accept any params
+         * validation will happen later
+         */
         { /* end of list */ }
     },
 };
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 387b050..f487a8f 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -57,7 +57,7 @@
 048 img auto quick
 049 rw auto
 050 rw auto backing quick
-#051 rw auto
+051 rw auto
 052 rw auto backing
 053 rw auto
 054 rw auto
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 2/5] qcow2: Add refcount update reason to all callers
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
@ 2013-06-13 11:47 ` Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This adds a refcount update reason to all callers of update_refcounts(),
so that a follow-up patch can use this information to decide whether
clusters that reach a refcount of 0 should be discarded in the image
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c  | 16 +++++++++------
 block/qcow2-refcount.c | 55 +++++++++++++++++++++++++++++++-------------------
 block/qcow2-snapshot.c |  6 ++++--
 block/qcow2.c          |  3 ++-
 block/qcow2.h          | 16 ++++++++++++---
 5 files changed, 63 insertions(+), 33 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 76f30e5..26de1c1 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -98,14 +98,16 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
         goto fail;
     }
     g_free(s->l1_table);
-    qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_OTHER);
     s->l1_table_offset = new_l1_table_offset;
     s->l1_table = new_l1_table;
     s->l1_size = new_l1_size;
     return 0;
  fail:
     g_free(new_l1_table);
-    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
+    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
+                        QCOW2_DISCARD_OTHER);
     return ret;
 }
 
@@ -548,7 +550,8 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
         /* Then decrease the refcount of the old table */
         if (l2_offset) {
-            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t),
+                                QCOW2_DISCARD_OTHER);
         }
     }
 
@@ -718,7 +721,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      */
     if (j != 0) {
         for (i = 0; i < j; i++) {
-            qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
+            qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1,
+                                    QCOW2_DISCARD_OTHER);
         }
     }
 
@@ -1339,7 +1343,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
         l2_table[l2_index + i] = cpu_to_be64(0);
 
         /* Then decrease the refcount */
-        qcow2_free_any_clusters(bs, old_offset, 1);
+        qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
     }
 
     ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
@@ -1415,7 +1419,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
         if (old_offset & QCOW_OFLAG_COMPRESSED) {
             l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
-            qcow2_free_any_clusters(bs, old_offset, 1);
+            qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
         } else {
             l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
         }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index b32738f..6d35e49 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -29,7 +29,7 @@
 static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                             int64_t offset, int64_t length,
-                            int addend);
+                            int addend, enum qcow2_discard_type type);
 
 
 /*********************************************************/
@@ -235,7 +235,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
     } else {
         /* Described somewhere else. This can recurse at most twice before we
          * arrive at a block that describes itself. */
-        ret = update_refcount(bs, new_block, s->cluster_size, 1);
+        ret = update_refcount(bs, new_block, s->cluster_size, 1,
+                              QCOW2_DISCARD_NEVER);
         if (ret < 0) {
             goto fail_block;
         }
@@ -399,7 +400,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
     /* Free old table. Remember, we must not change free_cluster_index */
     uint64_t old_free_cluster_index = s->free_cluster_index;
-    qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_OTHER);
     s->free_cluster_index = old_free_cluster_index;
 
     ret = load_refcount_block(bs, new_block, (void**) refcount_block);
@@ -420,7 +422,7 @@ fail_block:
 
 /* XXX: cache several refcount block clusters ? */
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
-    int64_t offset, int64_t length, int addend)
+    int64_t offset, int64_t length, int addend, enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int64_t start, last, cluster_offset;
@@ -506,7 +508,8 @@ fail:
      */
     if (ret < 0) {
         int dummy;
-        dummy = update_refcount(bs, offset, cluster_offset - offset, -addend);
+        dummy = update_refcount(bs, offset, cluster_offset - offset, -addend,
+                                QCOW2_DISCARD_NEVER);
         (void)dummy;
     }
 
@@ -522,12 +525,14 @@ fail:
  */
 static int update_cluster_refcount(BlockDriverState *bs,
                                    int64_t cluster_index,
-                                   int addend)
+                                   int addend,
+                                   enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int ret;
 
-    ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend);
+    ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend,
+                          type);
     if (ret < 0) {
         return ret;
     }
@@ -579,7 +584,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size)
         return offset;
     }
 
-    ret = update_refcount(bs, offset, size, 1);
+    ret = update_refcount(bs, offset, size, 1, QCOW2_DISCARD_NEVER);
     if (ret < 0) {
         return ret;
     }
@@ -611,7 +616,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     old_free_cluster_index = s->free_cluster_index;
     s->free_cluster_index = cluster_index + i;
 
-    ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
+    ret = update_refcount(bs, offset, i << s->cluster_bits, 1,
+                          QCOW2_DISCARD_NEVER);
     if (ret < 0) {
         return ret;
     }
@@ -649,7 +655,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if (free_in_cluster == 0)
             s->free_byte_offset = 0;
         if ((offset & (s->cluster_size - 1)) != 0)
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                    QCOW2_DISCARD_NEVER);
     } else {
         offset = qcow2_alloc_clusters(bs, s->cluster_size);
         if (offset < 0) {
@@ -659,7 +666,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if ((cluster_offset + s->cluster_size) == offset) {
             /* we are lucky: contiguous data */
             offset = s->free_byte_offset;
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                    QCOW2_DISCARD_NEVER);
             s->free_byte_offset += size;
         } else {
             s->free_byte_offset = offset;
@@ -676,12 +684,13 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
 }
 
 void qcow2_free_clusters(BlockDriverState *bs,
-                          int64_t offset, int64_t size)
+                          int64_t offset, int64_t size,
+                          enum qcow2_discard_type type)
 {
     int ret;
 
     BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_FREE);
-    ret = update_refcount(bs, offset, size, -1);
+    ret = update_refcount(bs, offset, size, -1, type);
     if (ret < 0) {
         fprintf(stderr, "qcow2_free_clusters failed: %s\n", strerror(-ret));
         /* TODO Remember the clusters to free them later and avoid leaking */
@@ -692,8 +701,8 @@ void qcow2_free_clusters(BlockDriverState *bs,
  * Free a cluster using its L2 entry (handles clusters of all types, e.g.
  * normal cluster, compressed cluster, etc.)
  */
-void qcow2_free_any_clusters(BlockDriverState *bs,
-    uint64_t l2_entry, int nb_clusters)
+void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
+                             int nb_clusters, enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
 
@@ -705,12 +714,12 @@ void qcow2_free_any_clusters(BlockDriverState *bs,
                            s->csize_mask) + 1;
             qcow2_free_clusters(bs,
                 (l2_entry & s->cluster_offset_mask) & ~511,
-                nb_csectors * 512);
+                nb_csectors * 512, type);
         }
         break;
     case QCOW2_CLUSTER_NORMAL:
         qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
-                            nb_clusters << s->cluster_bits);
+                            nb_clusters << s->cluster_bits, type);
         break;
     case QCOW2_CLUSTER_UNALLOCATED:
     case QCOW2_CLUSTER_ZERO:
@@ -785,7 +794,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                             int ret;
                             ret = update_refcount(bs,
                                 (offset & s->cluster_offset_mask) & ~511,
-                                nb_csectors * 512, addend);
+                                nb_csectors * 512, addend,
+                                QCOW2_DISCARD_SNAPSHOT);
                             if (ret < 0) {
                                 goto fail;
                             }
@@ -795,7 +805,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                     } else {
                         uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
                         if (addend != 0) {
-                            refcount = update_cluster_refcount(bs, cluster_index, addend);
+                            refcount = update_cluster_refcount(bs, cluster_index, addend,
+                                                               QCOW2_DISCARD_SNAPSHOT);
                         } else {
                             refcount = get_refcount(bs, cluster_index);
                         }
@@ -827,7 +838,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
 
             if (addend != 0) {
-                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend);
+                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend,
+                                                   QCOW2_DISCARD_SNAPSHOT);
             } else {
                 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
             }
@@ -1253,7 +1265,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 
             if (num_fixed) {
                 ret = update_refcount(bs, i << s->cluster_bits, 1,
-                                      refcount2 - refcount1);
+                                      refcount2 - refcount1,
+                                      QCOW2_DISCARD_ALWAYS);
                 if (ret >= 0) {
                     (*num_fixed)++;
                     continue;
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 992a5c8..0caac90 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -262,7 +262,8 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
     }
 
     /* free the old snapshot table */
-    qcow2_free_clusters(bs, s->snapshots_offset, s->snapshots_size);
+    qcow2_free_clusters(bs, s->snapshots_offset, s->snapshots_size,
+                        QCOW2_DISCARD_SNAPSHOT);
     s->snapshots_offset = snapshots_offset;
     s->snapshots_size = snapshots_size;
     return 0;
@@ -569,7 +570,8 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
     if (ret < 0) {
         return ret;
     }
-    qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_SNAPSHOT);
 
     /* must update the copied flag on the current cluster offsets */
     ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
diff --git a/block/qcow2.c b/block/qcow2.c
index 0fa5cb2..e28ea47 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1196,7 +1196,8 @@ static int preallocate(BlockDriverState *bs)
 
         ret = qcow2_alloc_cluster_link_l2(bs, meta);
         if (ret < 0) {
-            qcow2_free_any_clusters(bs, meta->alloc_offset, meta->nb_clusters);
+            qcow2_free_any_clusters(bs, meta->alloc_offset, meta->nb_clusters,
+                                    QCOW2_DISCARD_NEVER);
             return ret;
         }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 6959c6a..64a6479 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -129,6 +129,15 @@ enum {
     QCOW2_COMPAT_FEAT_MASK            = QCOW2_COMPAT_LAZY_REFCOUNTS,
 };
 
+enum qcow2_discard_type {
+    QCOW2_DISCARD_NEVER = 0,
+    QCOW2_DISCARD_ALWAYS,
+    QCOW2_DISCARD_REQUEST,
+    QCOW2_DISCARD_SNAPSHOT,
+    QCOW2_DISCARD_OTHER,
+    QCOW2_DISCARD_MAX
+};
+
 typedef struct Qcow2Feature {
     uint8_t type;
     uint8_t bit;
@@ -349,9 +358,10 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     int nb_clusters);
 int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size);
 void qcow2_free_clusters(BlockDriverState *bs,
-    int64_t offset, int64_t size);
-void qcow2_free_any_clusters(BlockDriverState *bs,
-    uint64_t cluster_offset, int nb_clusters);
+                          int64_t offset, int64_t size,
+                          enum qcow2_discard_type type);
+void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
+                             int nb_clusters, enum qcow2_discard_type type);
 
 int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     int64_t l1_table_offset, int l1_size, int addend);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
@ 2013-06-13 11:47 ` Kevin Wolf
  2013-06-13 22:10   ` Paolo Bonzini
  2013-06-17 15:41   ` Stefan Hajnoczi
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 4/5] qcow2: Batch discards Kevin Wolf
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Deleted snapshots are discarded in the image file by default, discard
requests take their default from the -drive discard=... option and other
places that free clusters must always be enabled explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c |  5 +++++
 block/qcow2.c          | 26 ++++++++++++++++++++++++++
 block/qcow2.h          |  5 +++++
 3 files changed, 36 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 6d35e49..7488988 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -488,6 +488,11 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             s->free_cluster_index = cluster_index;
         }
         refcount_block[block_index] = cpu_to_be16(refcount);
+        if (refcount == 0 && s->discard_passthrough[type]) {
+            /* Try discarding, ignore errors */
+            /* FIXME Doing this cluster by cluster will be painfully slow */
+            bdrv_discard(bs->file, cluster_offset, 1);
+        }
     }
 
     ret = 0;
diff --git a/block/qcow2.c b/block/qcow2.c
index e28ea47..62e6753 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -295,6 +295,22 @@ static QemuOptsList qcow2_runtime_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "Postpone refcount updates",
         },
+        {
+            .name = QCOW2_OPT_DISCARD_REQUEST,
+            .type = QEMU_OPT_BOOL,
+            .help = "Pass guest discard requests to the layer below",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_SNAPSHOT,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when snapshot related space "
+                    "is freed",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_OTHER,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when other clusters are freed",
+        },
         { /* end of list */ }
     },
 };
@@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
         (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
 
+    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
+    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
+    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
+                          flags & BDRV_O_UNMAP),
+    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
+    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
+
     qemu_opts_del(opts);
 
     if (s->use_lazy_refcounts && s->qcow_version < 3) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 64a6479..6f91b9a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -60,6 +60,9 @@
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+#define QCOW2_OPT_DISCARD_REQUEST "pass_discard_request"
+#define QCOW2_OPT_DISCARD_SNAPSHOT "pass_discard_snapshot"
+#define QCOW2_OPT_DISCARD_OTHER "pass_discard_other"
 
 typedef struct QCowHeader {
     uint32_t magic;
@@ -187,6 +190,8 @@ typedef struct BDRVQcowState {
     int qcow_version;
     bool use_lazy_refcounts;
 
+    bool discard_passthrough[QCOW2_DISCARD_MAX];
+
     uint64_t incompatible_features;
     uint64_t compatible_features;
     uint64_t autoclear_features;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 4/5] qcow2: Batch discards
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (2 preceding siblings ...)
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
@ 2013-06-13 11:47 ` Kevin Wolf
  2013-06-17 15:45   ` Stefan Hajnoczi
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level Kevin Wolf
  2013-06-18  3:03 ` [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Wenchao Xia
  5 siblings, 1 reply; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This optimises the discard operation for freed clusters by batching
discard requests (both snapshot deletion and bdrv_discard end up
updating the refcounts cluster by cluster).

Note that we don't discard asynchronously, but keep s->lock held. This
is to avoid that a freed cluster is reallocated and written to while the
discard is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c  | 22 ++++++++++---
 block/qcow2-refcount.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++--
 block/qcow2.c          |  1 +
 block/qcow2.h          | 11 +++++++
 4 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 26de1c1..e86bddc 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1374,18 +1374,25 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
 
     nb_clusters = size_to_clusters(s, end_offset - offset);
 
+    s->cache_discards = true;
+
     /* Each L2 table is handled by its own loop iteration */
     while (nb_clusters > 0) {
         ret = discard_single_l2(bs, offset, nb_clusters);
         if (ret < 0) {
-            return ret;
+            goto fail;
         }
 
         nb_clusters -= ret;
         offset += (ret * s->cluster_size);
     }
 
-    return 0;
+    ret = 0;
+fail:
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
+    return ret;
 }
 
 /*
@@ -1447,15 +1454,22 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors)
     /* Each L2 table is handled by its own loop iteration */
     nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS);
 
+    s->cache_discards = true;
+
     while (nb_clusters > 0) {
         ret = zero_single_l2(bs, offset, nb_clusters);
         if (ret < 0) {
-            return ret;
+            goto fail;
         }
 
         nb_clusters -= ret;
         offset += (ret * s->cluster_size);
     }
 
-    return 0;
+    ret = 0;
+fail:
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
+    return ret;
 }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7488988..b5a4fb3 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -420,6 +420,77 @@ fail_block:
     return ret;
 }
 
+void qcow2_process_discards(BlockDriverState *bs, int ret)
+{
+    BDRVQcowState *s = bs->opaque;
+    Qcow2DiscardRegion *d, *next;
+
+    QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
+        QTAILQ_REMOVE(&s->discards, d, next);
+
+        /* Discard is optional, ignore the return value */
+        if (ret >= 0) {
+            fprintf(stderr, "discard: %lx + %lx\n",
+                         d->offset >> BDRV_SECTOR_BITS,
+                         d->bytes >> BDRV_SECTOR_BITS);
+            bdrv_discard(bs->file,
+                         d->offset >> BDRV_SECTOR_BITS,
+                         d->bytes >> BDRV_SECTOR_BITS);
+        }
+
+        g_free(d);
+    }
+}
+
+static void update_refcount_discard(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t length)
+{
+    BDRVQcowState *s = bs->opaque;
+    Qcow2DiscardRegion *d, *p, *next;
+
+    QTAILQ_FOREACH(d, &s->discards, next) {
+        uint64_t new_start = MIN(offset, d->offset);
+        uint64_t new_end = MAX(offset + length, d->offset + d->bytes);
+
+        if (new_end - new_start <= length + d->bytes) {
+            /* There can't be any overlap, areas ending up here have no
+             * references any more and therefore shouldn't get freed another
+             * time. */
+            assert(d->bytes + length == new_end - new_start);
+            d->offset = new_start;
+            d->bytes = new_end - new_start;
+            goto found;
+        }
+    }
+
+    d = g_malloc(sizeof(*d));
+    *d = (Qcow2DiscardRegion) {
+        .bs     = bs,
+        .offset = offset,
+        .bytes  = length,
+    };
+    QTAILQ_INSERT_TAIL(&s->discards, d, next);
+
+found:
+    /* Merge discard requests if they are adjacent now */
+    QTAILQ_FOREACH_SAFE(p, &s->discards, next, next) {
+        if (p == d
+            || p->offset > d->offset + d->bytes
+            || d->offset > p->offset + p->bytes)
+        {
+            continue;
+        }
+
+        /* Still no overlap possible */
+        assert(p->offset == d->offset + d->bytes
+            || d->offset == p->offset + p->bytes);
+
+        QTAILQ_REMOVE(&s->discards, p, next);
+        d->offset = MIN(d->offset, p->offset);
+        d->bytes += p->bytes;
+    }
+}
+
 /* XXX: cache several refcount block clusters ? */
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
     int64_t offset, int64_t length, int addend, enum qcow2_discard_type type)
@@ -488,15 +559,18 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             s->free_cluster_index = cluster_index;
         }
         refcount_block[block_index] = cpu_to_be16(refcount);
+
         if (refcount == 0 && s->discard_passthrough[type]) {
-            /* Try discarding, ignore errors */
-            /* FIXME Doing this cluster by cluster will be painfully slow */
-            bdrv_discard(bs->file, cluster_offset, 1);
+            update_refcount_discard(bs, cluster_offset, s->cluster_size);
         }
     }
 
     ret = 0;
 fail:
+    if (!s->cache_discards) {
+        qcow2_process_discards(bs, ret);
+    }
+
     /* Write last changed block to disk */
     if (refcount_block) {
         int wret;
@@ -755,6 +829,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     l1_table = NULL;
     l1_size2 = l1_size * sizeof(uint64_t);
 
+    s->cache_discards = true;
+
     /* WARNING: qcow2_snapshot_goto relies on this function not using the
      * l1_table_offset when it is the current s->l1_table_offset! Be careful
      * when changing this! */
@@ -867,6 +943,9 @@ fail:
         qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     }
 
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
     /* Update L1 only if it isn't deleted anyway (addend = -1) */
     if (ret == 0 && addend >= 0 && l1_modified) {
         for (i = 0; i < l1_size; i++) {
diff --git a/block/qcow2.c b/block/qcow2.c
index 62e6753..de93d81 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -486,6 +486,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     }
 
     QLIST_INIT(&s->cluster_allocs);
+    QTAILQ_INIT(&s->discards);
 
     /* read qcow2 extensions */
     if (qcow2_read_extensions(bs, header.header_length, ext_end, NULL)) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 6f91b9a..3b2d5cd 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -147,6 +147,13 @@ typedef struct Qcow2Feature {
     char    name[46];
 } QEMU_PACKED Qcow2Feature;
 
+typedef struct Qcow2DiscardRegion {
+    BlockDriverState *bs;
+    uint64_t offset;
+    uint64_t bytes;
+    QTAILQ_ENTRY(Qcow2DiscardRegion) next;
+} Qcow2DiscardRegion;
+
 typedef struct BDRVQcowState {
     int cluster_bits;
     int cluster_size;
@@ -199,6 +206,8 @@ typedef struct BDRVQcowState {
     size_t unknown_header_fields_size;
     void* unknown_header_fields;
     QLIST_HEAD(, Qcow2UnknownHeaderExtension) unknown_header_ext;
+    QTAILQ_HEAD (, Qcow2DiscardRegion) discards;
+    bool cache_discards;
 } BDRVQcowState;
 
 /* XXX: use std qcow open function ? */
@@ -374,6 +383,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                           BdrvCheckMode fix);
 
+void qcow2_process_discards(BlockDriverState *bs, int ret);
+
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (3 preceding siblings ...)
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 4/5] qcow2: Batch discards Kevin Wolf
@ 2013-06-13 11:47 ` Kevin Wolf
  2013-06-13 22:06   ` Paolo Bonzini
  2013-06-18  3:03 ` [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Wenchao Xia
  5 siblings, 1 reply; 18+ messages in thread
From: Kevin Wolf @ 2013-06-13 11:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Turning on discard options in qcow2 doesn't help a lot when the discard
requests that it issues are thrown away by the raw-posix layer. This
patch always enables discard functionality on the protocol level so that
it's the image format's responsibility to send (or not) discard
requests. Requests sent by the guest will be allowed or ignored by the
top level BlockDriverState, which depends on the discard=... option like
before.

In particular, this means that even without specifying options, the
qcow2 default of discarding deleted snapshots actually takes effect now,
both for qemu and qemu-img.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 79ad33d..0a7cf2f 100644
--- a/block.c
+++ b/block.c
@@ -1045,7 +1045,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options,
     extract_subqdict(options, &file_options, "file.");
 
     ret = bdrv_file_open(&file, filename, file_options,
-                         bdrv_open_flags(bs, flags));
+                         bdrv_open_flags(bs, flags | BDRV_O_UNMAP));
     if (ret < 0) {
         goto fail;
     }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level Kevin Wolf
@ 2013-06-13 22:06   ` Paolo Bonzini
  2013-06-14  8:14     ` Kevin Wolf
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-06-13 22:06 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

Il 13/06/2013 07:47, Kevin Wolf ha scritto:
> Turning on discard options in qcow2 doesn't help a lot when the discard
> requests that it issues are thrown away by the raw-posix layer. This
> patch always enables discard functionality on the protocol level so that
> it's the image format's responsibility to send (or not) discard
> requests. Requests sent by the guest will be allowed or ignored by the
> top level BlockDriverState, which depends on the discard=... option like
> before.
> 
> In particular, this means that even without specifying options, the
> qcow2 default of discarding deleted snapshots actually takes effect now,
> both for qemu and qemu-img.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index 79ad33d..0a7cf2f 100644
> --- a/block.c
> +++ b/block.c
> @@ -1045,7 +1045,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options,
>      extract_subqdict(options, &file_options, "file.");
>  
>      ret = bdrv_file_open(&file, filename, file_options,
> -                         bdrv_open_flags(bs, flags));
> +                         bdrv_open_flags(bs, flags | BDRV_O_UNMAP));
>      if (ret < 0) {
>          goto fail;
>      }
> 

Can you still disable it with -drive file.discard=ignore?

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
@ 2013-06-13 22:10   ` Paolo Bonzini
  2013-06-14  8:31     ` Kevin Wolf
  2013-06-17 15:41   ` Stefan Hajnoczi
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-06-13 22:10 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

Il 13/06/2013 07:47, Kevin Wolf ha scritto:
> +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> +                          flags & BDRV_O_UNMAP),

I think there should not be two ways to enable it, it is confusing.

> +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),

Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
rationale?)

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level
  2013-06-13 22:06   ` Paolo Bonzini
@ 2013-06-14  8:14     ` Kevin Wolf
  0 siblings, 0 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-14  8:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.06.2013 um 00:06 hat Paolo Bonzini geschrieben:
> Il 13/06/2013 07:47, Kevin Wolf ha scritto:
> > Turning on discard options in qcow2 doesn't help a lot when the discard
> > requests that it issues are thrown away by the raw-posix layer. This
> > patch always enables discard functionality on the protocol level so that
> > it's the image format's responsibility to send (or not) discard
> > requests. Requests sent by the guest will be allowed or ignored by the
> > top level BlockDriverState, which depends on the discard=... option like
> > before.
> > 
> > In particular, this means that even without specifying options, the
> > qcow2 default of discarding deleted snapshots actually takes effect now,
> > both for qemu and qemu-img.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block.c b/block.c
> > index 79ad33d..0a7cf2f 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1045,7 +1045,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options,
> >      extract_subqdict(options, &file_options, "file.");
> >  
> >      ret = bdrv_file_open(&file, filename, file_options,
> > -                         bdrv_open_flags(bs, flags));
> > +                         bdrv_open_flags(bs, flags | BDRV_O_UNMAP));
> >      if (ret < 0) {
> >          goto fail;
> >      }
> > 
> 
> Can you still disable it with -drive file.discard=ignore?

This requires a few more changes (basically moving BDRV_O_UNMAP from
flags into a boolean in the QDict), but I think eventually we'll want to
allow this.

Kevin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-13 22:10   ` Paolo Bonzini
@ 2013-06-14  8:31     ` Kevin Wolf
  2013-06-14 14:16       ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Wolf @ 2013-06-14  8:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.06.2013 um 00:10 hat Paolo Bonzini geschrieben:
> Il 13/06/2013 07:47, Kevin Wolf ha scritto:
> > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> > +                          flags & BDRV_O_UNMAP),
> 
> I think there should not be two ways to enable it, it is confusing.

Hm, yes... But it's also confusing to have qcow2 provide an incomplete
set of categories. Maybe we shouldn't have introduced -drive discard=...
as a global option to begin with.

> > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
> 
> Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
> rationale?)

The idea was that discard is slow and therefore disabled by default,
except when you're doing an expensive snapshot operation that can
potentially free a lot of space at once with not too many requests, so
there it's enabled. And if you said -drive discard=on, you obviously
want guest requests to take effect.

We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
you prefer.

Kevin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-14  8:31     ` Kevin Wolf
@ 2013-06-14 14:16       ` Paolo Bonzini
  2013-06-14 14:31         ` Kevin Wolf
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2013-06-14 14:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

Il 14/06/2013 04:31, Kevin Wolf ha scritto:
>>> > > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
>>> > > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
>>> > > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
>>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
>>> > > +                          flags & BDRV_O_UNMAP),
>> > 
>> > I think there should not be two ways to enable it, it is confusing.
> Hm, yes... But it's also confusing to have qcow2 provide an incomplete
> set of categories. Maybe we shouldn't have introduced -drive discard=...
> as a global option to begin with.
> 
>>> > > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
>>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
>>> > > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
>>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
>> > 
>> > Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
>> > rationale?)
> The idea was that discard is slow and therefore disabled by default,
> except when you're doing an expensive snapshot operation that can
> potentially free a lot of space at once with not too many requests, so
> there it's enabled. And if you said -drive discard=on, you obviously
> want guest requests to take effect.
> 
> We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
> you prefer.

It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
leaving it as default to false.  It won't waste more than a few clusters.

In the end discard_snapshot and discard_other should rarely be needed in
practice, so I don't think having discard=... is a mistake.  Too many
knobs won't really be needed.

In fact, perhaps we do not need discard_snapshot and discard_request,
only discard_other.  discard_snapshot can be replaced by
file.discard=ignore, discard_request by discard=unmap.

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-14 14:16       ` Paolo Bonzini
@ 2013-06-14 14:31         ` Kevin Wolf
  2013-06-14 15:00           ` Paolo Bonzini
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Wolf @ 2013-06-14 14:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.06.2013 um 16:16 hat Paolo Bonzini geschrieben:
> Il 14/06/2013 04:31, Kevin Wolf ha scritto:
> >>> > > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> >>> > > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> >>> > > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> >>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> >>> > > +                          flags & BDRV_O_UNMAP),
> >> > 
> >> > I think there should not be two ways to enable it, it is confusing.
> > Hm, yes... But it's also confusing to have qcow2 provide an incomplete
> > set of categories. Maybe we shouldn't have introduced -drive discard=...
> > as a global option to begin with.
> > 
> >>> > > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> >>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> >>> > > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> >>> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
> >> > 
> >> > Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
> >> > rationale?)
> > The idea was that discard is slow and therefore disabled by default,
> > except when you're doing an expensive snapshot operation that can
> > potentially free a lot of space at once with not too many requests, so
> > there it's enabled. And if you said -drive discard=on, you obviously
> > want guest requests to take effect.
> > 
> > We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
> > you prefer.
> 
> It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
> leaving it as default to false.  It won't waste more than a few clusters.

Yes, it's generally relatively rare, like growing L1 or refcount table.
There is one case where it should trigger a lot, though: Overwriting
clusters of a compressed image.

Hm, though actually it doesn't make a lot of sense there. The freed
cluster will immediately be used by the next write. Maybe COW should
actually be QCOW2_OPT_DISCARD_NEVER...

> In the end discard_snapshot and discard_other should rarely be needed in
> practice, so I don't think having discard=... is a mistake.  Too many
> knobs won't really be needed.
> 
> In fact, perhaps we do not need discard_snapshot and discard_request,
> only discard_other.  discard_snapshot can be replaced by
> file.discard=ignore, discard_request by discard=unmap.

This is only true if you rule out some combination as useless. For
example you would say that if you want to process guest requests, you
always want to have snapshots discarded as well. You also assume that
nobody wants the current behaviour (free clusters in qcow2 metadata, but
don't send discards to raw-posix).

Isn't this assuming a bit too much?

To be clear, I don't expect these knobs to be used much either, but I
have some feeling that some people (including us while debugging or
asking questions) may be glad later to have such low-level options that
control each layer separately.

Kevin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-14 14:31         ` Kevin Wolf
@ 2013-06-14 15:00           ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2013-06-14 15:00 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

Il 14/06/2013 10:31, Kevin Wolf ha scritto:
>> It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
>> leaving it as default to false.  It won't waste more than a few clusters.
> 
> Yes, it's generally relatively rare, like growing L1 or refcount table.
> There is one case where it should trigger a lot, though: Overwriting
> clusters of a compressed image.
> 
> Hm, though actually it doesn't make a lot of sense there. The freed
> cluster will immediately be used by the next write. Maybe COW should
> actually be QCOW2_OPT_DISCARD_NEVER...

Sounds reasonable.

>> In the end discard_snapshot and discard_other should rarely be needed in
>> practice, so I don't think having discard=... is a mistake.  Too many
>> knobs won't really be needed.
>>
>> In fact, perhaps we do not need discard_snapshot and discard_request,
>> only discard_other.  discard_snapshot can be replaced by
>> file.discard=ignore, discard_request by discard=unmap.
> 
> This is only true if you rule out some combination as useless. For
> example you would say that if you want to process guest requests, you
> always want to have snapshots discarded as well. You also assume that
> nobody wants the current behaviour (free clusters in qcow2 metadata, but
> don't send discards to raw-posix).
> 
> Isn't this assuming a bit too much?
> 
> To be clear, I don't expect these knobs to be used much either, but I
> have some feeling that some people (including us while debugging or
> asking questions) may be glad later to have such low-level options that
> control each layer separately.

Yeah, you're right.  They're definitely useful to have, even if it is
"just in case".

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
  2013-06-13 22:10   ` Paolo Bonzini
@ 2013-06-17 15:41   ` Stefan Hajnoczi
  2013-06-17 15:58     ` Kevin Wolf
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-06-17 15:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On Thu, Jun 13, 2013 at 01:47:41PM +0200, Kevin Wolf wrote:
> @@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
>      s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
>          (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
>  
> +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,

comma instead of semicolon?

> @@ -187,6 +190,8 @@ typedef struct BDRVQcowState {
>      int qcow_version;
>      bool use_lazy_refcounts;
>  
> +    bool discard_passthrough[QCOW2_DISCARD_MAX];
> +

Neat solution to specifying discard behavior.

Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] qcow2: Batch discards
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 4/5] qcow2: Batch discards Kevin Wolf
@ 2013-06-17 15:45   ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2013-06-17 15:45 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On Thu, Jun 13, 2013 at 01:47:42PM +0200, Kevin Wolf wrote:
> @@ -420,6 +420,77 @@ fail_block:
>      return ret;
>  }
>  
> +void qcow2_process_discards(BlockDriverState *bs, int ret)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    Qcow2DiscardRegion *d, *next;
> +
> +    QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
> +        QTAILQ_REMOVE(&s->discards, d, next);
> +
> +        /* Discard is optional, ignore the return value */
> +        if (ret >= 0) {
> +            fprintf(stderr, "discard: %lx + %lx\n",
> +                         d->offset >> BDRV_SECTOR_BITS,
> +                         d->bytes >> BDRV_SECTOR_BITS);

Debug code.  This doesn't happen with --enable-trace-backend=stderr ;).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-17 15:41   ` Stefan Hajnoczi
@ 2013-06-17 15:58     ` Kevin Wolf
  0 siblings, 0 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-17 15:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, stefanha

Am 17.06.2013 um 17:41 hat Stefan Hajnoczi geschrieben:
> On Thu, Jun 13, 2013 at 01:47:41PM +0200, Kevin Wolf wrote:
> > @@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
> >      s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
> >          (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
> >  
> > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> 
> comma instead of semicolon?

Oh. Isn't C a nice language? :-)

I'll fix that.

Kevin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters
  2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (4 preceding siblings ...)
  2013-06-13 11:47 ` [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level Kevin Wolf
@ 2013-06-18  3:03 ` Wenchao Xia
  2013-06-18  6:25   ` Kevin Wolf
  5 siblings, 1 reply; 18+ messages in thread
From: Wenchao Xia @ 2013-06-18  3:03 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

Hi, Kevin
  In normal case, qcow2 is based on raw-posix file which don't support
discard operation, so this serial doesn't change much in this case,
still real space is not freed in snapshot delete, and it completes
fast. Instead, tt affects the case that some protocol below support
discard, so delete will be more clean but slower in that case. Is this
true? Maybe add this in commit message?

> This series adds options to make qcow2 discard freed clusters, in several
> categories. By default, only freed clusters related to snapshots (i.e. mainly
> snapshot deletion) are discarded.
> 
> Kevin Wolf (5):
>    Revert "block: Disable driver-specific options for 1.5"
>    qcow2: Add refcount update reason to all callers
>    qcow2: Options to enable discard for freed clusters
>    qcow2: Batch discards
>    block: Always enable discard on the protocol level
> 
>   block.c                  |   2 +-
>   block/qcow2-cluster.c    |  38 +++++++++----
>   block/qcow2-refcount.c   | 139 ++++++++++++++++++++++++++++++++++++++++-------
>   block/qcow2-snapshot.c   |   6 +-
>   block/qcow2.c            |  30 +++++++++-
>   block/qcow2.h            |  32 ++++++++++-
>   blockdev.c               | 118 ++--------------------------------------
>   tests/qemu-iotests/group |   2 +-
>   8 files changed, 214 insertions(+), 153 deletions(-)
> 


-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters
  2013-06-18  3:03 ` [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Wenchao Xia
@ 2013-06-18  6:25   ` Kevin Wolf
  0 siblings, 0 replies; 18+ messages in thread
From: Kevin Wolf @ 2013-06-18  6:25 UTC (permalink / raw)
  To: Wenchao Xia; +Cc: qemu-devel, stefanha

Am 18.06.2013 um 05:03 hat Wenchao Xia geschrieben:
> Hi, Kevin
>   In normal case, qcow2 is based on raw-posix file which don't support
> discard operation

It does, see raw_aio_discard().

> , so this serial doesn't change much in this case,
> still real space is not freed in snapshot delete, and it completes
> fast. Instead, tt affects the case that some protocol below support
> discard, so delete will be more clean but slower in that case. Is this
> true? Maybe add this in commit message?

Yes, it becomes a little bit slower, but not much in my tests.

Kevin

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-06-18  6:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-13 11:47 [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Kevin Wolf
2013-06-13 11:47 ` [Qemu-devel] [PATCH 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
2013-06-13 11:47 ` [Qemu-devel] [PATCH 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
2013-06-13 11:47 ` [Qemu-devel] [PATCH 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
2013-06-13 22:10   ` Paolo Bonzini
2013-06-14  8:31     ` Kevin Wolf
2013-06-14 14:16       ` Paolo Bonzini
2013-06-14 14:31         ` Kevin Wolf
2013-06-14 15:00           ` Paolo Bonzini
2013-06-17 15:41   ` Stefan Hajnoczi
2013-06-17 15:58     ` Kevin Wolf
2013-06-13 11:47 ` [Qemu-devel] [PATCH 4/5] qcow2: Batch discards Kevin Wolf
2013-06-17 15:45   ` Stefan Hajnoczi
2013-06-13 11:47 ` [Qemu-devel] [PATCH 5/5] block: Always enable discard on the protocol level Kevin Wolf
2013-06-13 22:06   ` Paolo Bonzini
2013-06-14  8:14     ` Kevin Wolf
2013-06-18  3:03 ` [Qemu-devel] [PATCH 0/5] qcow2: Discard freed clusters Wenchao Xia
2013-06-18  6:25   ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.