All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 1/1] qcow2: handle cluster leak happening with a guest TRIM command
@ 2017-05-21 14:21 Denis V. Lunev
  2017-05-22 11:37 ` Eric Blake
  0 siblings, 1 reply; 5+ messages in thread
From: Denis V. Lunev @ 2017-05-21 14:21 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: Denis V. Lunev, Kevin Wolf, Max Reitz

  qemu-img create -f qcow2 1.img 64G
  qemu-io -c "write -P 0x32 0 64k" 1.img
results in
  324 -rw-r--r-- 1 den den 393216 May 21 16:48 1.img
Subsequent
  qemu-io -c "write -z 0 64k" 1.img
  qemu-io -c "write -P 0x32 0 64k" 1.img
results in
  388 -rw-r--r-- 1 den den 458752 May 21 16:50 1.img
which looks like we have 1 cluster leaked.

Indeed, qcow2_co_pwrite_zeroes calls qcow2_zero_clusters/zero_single_l2,
which does not update refcount for the host cluster and keep the offset
as used. Later on handle_copied() does not take into account
QCOW2_CLUSTER_ZERO type of the cluster.

For now we comes into a very bad situation after qcow2_zero_cluster.
We have a hole in the host file and we have the offset allocated for
that guest cluster. Now there are 2 options:
1) allocate new offset once the write will come into this guest
   cluster (actually happens, but original cluster offset is leaked)
2) re-use host offset, i.e. fix handle_copied() to allow to reuse
   offset not only for QCOW2_CLUSTER_NORMAL but for QCOW2_CLUSTER_ZERO
   too
Option 2) seems worse to me as in this case we can easily have host
fragmentation in that cluster if writes will come in small pieces.

This is not a very big deal if we have filesystem with PUCH_HOLE support,
but without this feature the cluster is actually leaked forever.

The patch replaces zero_single_l2 with discard_single_l2 and removes
now unused zero_single_l2 to fix the situation.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 46 ++--------------------------------------------
 1 file changed, 2 insertions(+), 44 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 100398c..1e53a7c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1548,49 +1548,6 @@ fail:
     return ret;
 }
 
-/*
- * This zeroes as many clusters of nb_clusters as possible at once (i.e.
- * all clusters in the same L2 table) and returns the number of zeroed
- * clusters.
- */
-static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
-                          uint64_t nb_clusters, int flags)
-{
-    BDRVQcow2State *s = bs->opaque;
-    uint64_t *l2_table;
-    int l2_index;
-    int ret;
-    int i;
-
-    ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
-    if (ret < 0) {
-        return ret;
-    }
-
-    /* Limit nb_clusters to one L2 table */
-    nb_clusters = MIN(nb_clusters, s->l2_size - l2_index);
-    assert(nb_clusters <= INT_MAX);
-
-    for (i = 0; i < nb_clusters; i++) {
-        uint64_t old_offset;
-
-        old_offset = be64_to_cpu(l2_table[l2_index + i]);
-
-        /* Update L2 entries */
-        qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
-        if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
-            l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
-            qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
-        } else {
-            l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
-        }
-    }
-
-    qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
-
-    return nb_clusters;
-}
-
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
                         int flags)
 {
@@ -1609,7 +1566,8 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
     s->cache_discards = true;
 
     while (nb_clusters > 0) {
-        ret = zero_single_l2(bs, offset, nb_clusters, flags);
+        ret = discard_single_l2(bs, offset, nb_clusters,
+                                QCOW2_DISCARD_REQUEST, false);
         if (ret < 0) {
             goto fail;
         }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-05-23 14:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-21 14:21 [Qemu-devel] [PATCH 1/1] qcow2: handle cluster leak happening with a guest TRIM command Denis V. Lunev
2017-05-22 11:37 ` Eric Blake
2017-05-22 11:56   ` Denis V. Lunev
2017-05-22 13:02     ` Eric Blake
2017-05-23 14:52       ` Denis V. Lunev

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.