* [PATCH 0/3] qcow2: Fix image corruption bug in 4.1
@ 2019-10-23 15:26 Kevin Wolf
  2019-10-23 15:26 ` [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-23 15:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

This series fixes an image corruption bug that was introduced in commit
69f47505e ('block: avoid recursive block_status call if possible'),
first contained in the QEMU 4.1.0 release.

This bug was reported by Michael Weiser on Launchpad:
https://bugs.launchpad.net/qemu/+bug/1846427

Kevin Wolf (3):
  coroutine: Add qemu_co_mutex_assert_locked()
  qcow2: Assert that qcow2_cache_get() callers hold s->lock
  qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()

 include/qemu/coroutine.h | 7 +++++++
 block/qcow2-cache.c      | 5 +++++
 block/qcow2-refcount.c   | 2 ++
 block/qcow2.c            | 3 ++-
 4 files changed, 16 insertions(+), 1 deletion(-)

-- 
2.20.1




* [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-23 15:26 [PATCH 0/3] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
@ 2019-10-23 15:26 ` Kevin Wolf
  2019-10-24  9:59   ` Denis Lunev
  2019-10-23 15:26 ` [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock Kevin Wolf
  2019-10-23 15:26 ` [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
  2 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-10-23 15:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

Some functions require that the caller holds a certain CoMutex for them
to operate correctly. Add a function so that they can assert the lock is
really held.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/qemu/coroutine.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9801e7f5a4..a36bcfe5c8 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -167,6 +167,13 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
  */
 void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 
+/**
+ * Assert that the current coroutine holds @mutex.
+ */
+static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
+{
+    assert(mutex->locked && mutex->holder == qemu_coroutine_self());
+}
 
 /**
  * CoQueues are a mechanism to queue coroutines in order to continue executing
-- 
2.20.1




* [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-23 15:26 [PATCH 0/3] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
  2019-10-23 15:26 ` [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
@ 2019-10-23 15:26 ` Kevin Wolf
  2019-10-23 15:37   ` Kevin Wolf
  2019-10-24 10:01   ` Denis Lunev
  2019-10-23 15:26 ` [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
  2 siblings, 2 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-23 15:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

qcow2_cache_do_get() requires that s->lock is locked because it can
yield between picking a cache entry and actually taking ownership of it
by setting offset and increasing the reference count.

Add an assertion to make sure the caller really holds the lock. The
function can be called outside of coroutine context, where bdrv_pread
and flushes become synchronous operations. The lock cannot and need not
be taken in this case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cache.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index d29b038a67..75b13dad99 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
     int min_lru_index = -1;
 
     assert(offset != 0);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_assert_locked(&s->lock);
+    }
 
     trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
                           offset, read_from_disk);
@@ -386,6 +389,8 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
         }
     }
 
+    assert(c->entries[i].ref == 0);
+    assert(c->entries[i].offset == 0);
     c->entries[i].offset = offset;
 
     /* And return the right table */
-- 
2.20.1




* [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  2019-10-23 15:26 [PATCH 0/3] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
  2019-10-23 15:26 ` [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
  2019-10-23 15:26 ` [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock Kevin Wolf
@ 2019-10-23 15:26 ` Kevin Wolf
  2019-10-24 10:46   ` Vladimir Sementsov-Ogievskiy
  2 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-10-23 15:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
requires s->lock to be taken to protect its accesses to the refcount
table and refcount blocks. However, nothing in this code path actually
took the lock. This could cause the same cache entry to be used by two
requests at the same time, for different tables at different offsets,
resulting in image corruption.

As it would be preferable to base the detection on consistent data (even
though it's just heuristics), let's take the lock not only around the
qcow2_get_refcount() calls, but around the whole function.

This patch takes the lock in qcow2_co_block_status() earlier and asserts
in qcow2_detect_metadata_preallocation() that we hold the lock.

Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Cc: qemu-stable@nongnu.org
Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c | 2 ++
 block/qcow2.c          | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ef965d7895..0d64bf5a5e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
     int64_t i, end_cluster, cluster_count = 0, threshold;
     int64_t file_length, real_allocation, real_clusters;
 
+    qemu_co_mutex_assert_locked(&s->lock);
+
     file_length = bdrv_getlength(bs->file->bs);
     if (file_length < 0) {
         return file_length;
diff --git a/block/qcow2.c b/block/qcow2.c
index 8b05933565..0bc69e6996 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1916,6 +1916,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     unsigned int bytes;
     int status = 0;
 
+    qemu_co_mutex_lock(&s->lock);
+
     if (!s->metadata_preallocation_checked) {
         ret = qcow2_detect_metadata_preallocation(bs);
         s->metadata_preallocation = (ret == 1);
@@ -1923,7 +1925,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     }
 
     bytes = MIN(INT_MAX, count);
-    qemu_co_mutex_lock(&s->lock);
     ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
-- 
2.20.1




* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-23 15:26 ` [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock Kevin Wolf
@ 2019-10-23 15:37   ` Kevin Wolf
  2019-10-25 10:35     ` Michael Weiser
  2019-10-24 10:01   ` Denis Lunev
  1 sibling, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-10-23 15:37 UTC (permalink / raw)
  To: qemu-block
  Cc: psyhomb, michael, vsementsov, den, qemu-devel, qemu-stable,
	dgilbert, mreitz, lersek

On 23.10.2019 at 17:26, Kevin Wolf wrote:
> qcow2_cache_do_get() requires that s->lock is locked because it can
> yield between picking a cache entry and actually taking ownership of it
> by setting offset and increasing the reference count.
> 
> Add an assertion to make sure the caller really holds the lock. The
> function can be called outside of coroutine context, where bdrv_pread
> and flushes become synchronous operations. The lock cannot and need not
> be taken in this case.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Oops, this one was a bit too optimistic. :-)

I'm still running tests to see if any other code paths trigger the
assertion, but image creation calls this without the lock held (which is
harmless because nobody else knows about the image so there won't be
concurrent requests). The following patch is needed additionally to make
image creation work with the new assertion.

Kevin


diff --git a/block/qcow2.c b/block/qcow2.c
index 0bc69e6996..7761cf3e07 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3213,6 +3213,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
     BlockBackend *blk = NULL;
     BlockDriverState *bs = NULL;
     BlockDriverState *data_bs = NULL;
+    BDRVQcow2State *s;
     QCowHeader *header;
     size_t cluster_size;
     int version;
@@ -3424,7 +3425,12 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         goto out;
     }

+    s = blk_bs(blk)->opaque;
+
+    qemu_co_mutex_lock(&s->lock);
     ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
+    qemu_co_mutex_unlock(&s->lock);
+
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
                          "header and refcount table");
@@ -3437,7 +3443,6 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)

     /* Set the external data file if necessary */
     if (data_bs) {
-        BDRVQcow2State *s = blk_bs(blk)->opaque;
         s->image_data_file = g_strdup(data_bs->filename);
     }




* Re: [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-23 15:26 ` [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
@ 2019-10-24  9:59   ` Denis Lunev
  2019-10-24 10:54     ` Kevin Wolf
  0 siblings, 1 reply; 19+ messages in thread
From: Denis Lunev @ 2019-10-24  9:59 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

On 10/23/19 6:26 PM, Kevin Wolf wrote:
> Some functions require that the caller holds a certain CoMutex for them
> to operate correctly. Add a function so that they can assert the lock is
> really held.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/coroutine.h | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index 9801e7f5a4..a36bcfe5c8 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -167,6 +167,13 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
>   */
>  void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
>  
> +/**
> + * Assert that the current coroutine holds @mutex.
> + */
> +static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
> +{
> +    assert(mutex->locked && mutex->holder == qemu_coroutine_self());
> +}
>  
>  /**
>   * CoQueues are a mechanism to queue coroutines in order to continue executing
I think that we should use atomic_read(&mutex->locked) and require
barriers when working with the holder.

Den



* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-23 15:26 ` [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock Kevin Wolf
  2019-10-23 15:37   ` Kevin Wolf
@ 2019-10-24 10:01   ` Denis Lunev
  2019-10-24 10:57     ` Kevin Wolf
  1 sibling, 1 reply; 19+ messages in thread
From: Denis Lunev @ 2019-10-24 10:01 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

On 10/23/19 6:26 PM, Kevin Wolf wrote:
> qcow2_cache_do_get() requires that s->lock is locked because it can
> yield between picking a cache entry and actually taking ownership of it
> by setting offset and increasing the reference count.
>
> Add an assertion to make sure the caller really holds the lock. The
> function can be called outside of coroutine context, where bdrv_pread
> and flushes become synchronous operations. The lock cannot and need not
> be taken in this case.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cache.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> index d29b038a67..75b13dad99 100644
> --- a/block/qcow2-cache.c
> +++ b/block/qcow2-cache.c
> @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
>      int min_lru_index = -1;
>  
>      assert(offset != 0);
> +    if (qemu_in_coroutine()) {
> +        qemu_co_mutex_assert_locked(&s->lock);
> +    }

That doesn't look good to me. If this really requires the lock, we should
always check for the lock. On the other hand, we could face a missed
lock outside of coroutine context.

Den

>  
>      trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
>                            offset, read_from_disk);
> @@ -386,6 +389,8 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
>          }
>      }
>  
> +    assert(c->entries[i].ref == 0);
> +    assert(c->entries[i].offset == 0);
>      c->entries[i].offset = offset;
>  
>      /* And return the right table */




* Re: [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  2019-10-23 15:26 ` [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
@ 2019-10-24 10:46   ` Vladimir Sementsov-Ogievskiy
  2019-10-24 11:17     ` Kevin Wolf
  0 siblings, 1 reply; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-10-24 10:46 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: psyhomb, michael, Denis Lunev, qemu-devel, qemu-stable, dgilbert,
	mreitz, lersek

23.10.2019 18:26, Kevin Wolf wrote:
> qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
> requires s->lock to be taken to protect its accesses to the refcount
> table and refcount blocks. However, nothing in this code path actually
> took the lock. This could cause the same cache entry to be used by two
> requests at the same time, for different tables at different offsets,
> resulting in image corruption.
> 
> As it would be preferable to base the detection on consistent data (even
> though it's just heuristics), let's take the lock not only around the
> qcow2_get_refcount() calls, but around the whole function.
> 
> This patch takes the lock in qcow2_co_block_status() earlier and asserts
> in qcow2_detect_metadata_preallocation() that we hold the lock.
> 
> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> Cc: qemu-stable@nongnu.org
> Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Oh, I'm very sorry. I was going to backport this patch, and found that it was
fixed in our downstream long ago, even before the last upstream version of the
patch was sent :(

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/qcow2-refcount.c | 2 ++
>   block/qcow2.c          | 3 ++-
>   2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index ef965d7895..0d64bf5a5e 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
>       int64_t i, end_cluster, cluster_count = 0, threshold;
>       int64_t file_length, real_allocation, real_clusters;
>   
> +    qemu_co_mutex_assert_locked(&s->lock);
> +
>       file_length = bdrv_getlength(bs->file->bs);
>       if (file_length < 0) {
>           return file_length;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 8b05933565..0bc69e6996 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1916,6 +1916,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
>       unsigned int bytes;
>       int status = 0;
>   
> +    qemu_co_mutex_lock(&s->lock);
> +
>       if (!s->metadata_preallocation_checked) {
>           ret = qcow2_detect_metadata_preallocation(bs);
>           s->metadata_preallocation = (ret == 1);
> @@ -1923,7 +1925,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
>       }
>   
>       bytes = MIN(INT_MAX, count);
> -    qemu_co_mutex_lock(&s->lock);
>       ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
>       qemu_co_mutex_unlock(&s->lock);
>       if (ret < 0) {
> 


-- 
Best regards,
Vladimir



* Re: [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-24  9:59   ` Denis Lunev
@ 2019-10-24 10:54     ` Kevin Wolf
  2019-10-24 11:11       ` Denis Lunev
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-10-24 10:54 UTC (permalink / raw)
  To: Denis Lunev
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-devel, qemu-stable, dgilbert, mreitz, lersek

On 24.10.2019 at 11:59, Denis Lunev wrote:
> On 10/23/19 6:26 PM, Kevin Wolf wrote:
> > Some functions require that the caller holds a certain CoMutex for them
> > to operate correctly. Add a function so that they can assert the lock is
> > really held.
> >
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  include/qemu/coroutine.h | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> > index 9801e7f5a4..a36bcfe5c8 100644
> > --- a/include/qemu/coroutine.h
> > +++ b/include/qemu/coroutine.h
> > @@ -167,6 +167,13 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
> >   */
> >  void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
> >  
> > +/**
> > + * Assert that the current coroutine holds @mutex.
> > + */
> > +static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
> > +{
> > +    assert(mutex->locked && mutex->holder == qemu_coroutine_self());
> > +}
> >  
> >  /**
> >   * CoQueues are a mechanism to queue coroutines in order to continue executing
> I think that we should use atomic_read(&mutex->locked) and require
> barriers when working with the holder.

Hm, this would only be necessary for the case that the assertion doesn't
hold true. I'll do the atomic_read() because it's easy enough, but I
don't think we need or want barriers here. If another thread modifies
this concurrently, the condition is false either way.
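
For illustration, the adjusted helper might then look like this (just a
sketch, assuming only the atomic_read() change discussed above):

    static inline coroutine_fn void
    qemu_co_mutex_assert_locked(CoMutex *mutex)
    {
        /*
         * mutex->holder needs no synchronisation if the assertion holds
         * true, because the mutex protects it. If it doesn't hold true,
         * a concurrent reader sees either NULL or some other coroutine's
         * pointer, and the condition is false either way, so no barrier
         * is needed.
         */
        assert(atomic_read(&mutex->locked) &&
               mutex->holder == qemu_coroutine_self());
    }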

Kevin




* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-24 10:01   ` Denis Lunev
@ 2019-10-24 10:57     ` Kevin Wolf
  2019-10-24 11:14       ` Denis Lunev
  2019-10-24 13:03       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-24 10:57 UTC (permalink / raw)
  To: Denis Lunev
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-devel, qemu-stable, dgilbert, mreitz, lersek

On 24.10.2019 at 12:01, Denis Lunev wrote:
> On 10/23/19 6:26 PM, Kevin Wolf wrote:
> > qcow2_cache_do_get() requires that s->lock is locked because it can
> > yield between picking a cache entry and actually taking ownership of it
> > by setting offset and increasing the reference count.
> >
> > Add an assertion to make sure the caller really holds the lock. The
> > function can be called outside of coroutine context, where bdrv_pread
> > and flushes become synchronous operations. The lock cannot and need not
> > be taken in this case.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/qcow2-cache.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> > index d29b038a67..75b13dad99 100644
> > --- a/block/qcow2-cache.c
> > +++ b/block/qcow2-cache.c
> > @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
> >      int min_lru_index = -1;
> >  
> >      assert(offset != 0);
> > +    if (qemu_in_coroutine()) {
> > +        qemu_co_mutex_assert_locked(&s->lock);
> > +    }
> 
> that is looking not good to me. If this is really requires lock, we should
> check for the lock always. In the other hand we could face missed
> lock out of coroutine.

As the commit message explains, outside of coroutine context, we can't
yield and bdrv_pread and bdrv_flush become synchronous operations
instead, so there is nothing else that we need to protect against.

Kevin




* Re: [PATCH 1/3] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-24 10:54     ` Kevin Wolf
@ 2019-10-24 11:11       ` Denis Lunev
  0 siblings, 0 replies; 19+ messages in thread
From: Denis Lunev @ 2019-10-24 11:11 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-devel, qemu-stable, dgilbert, mreitz, lersek

On 10/24/19 1:54 PM, Kevin Wolf wrote:
> On 24.10.2019 at 11:59, Denis Lunev wrote:
>> On 10/23/19 6:26 PM, Kevin Wolf wrote:
>>> Some functions require that the caller holds a certain CoMutex for them
>>> to operate correctly. Add a function so that they can assert the lock is
>>> really held.
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>> ---
>>>  include/qemu/coroutine.h | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
>>> index 9801e7f5a4..a36bcfe5c8 100644
>>> --- a/include/qemu/coroutine.h
>>> +++ b/include/qemu/coroutine.h
>>> @@ -167,6 +167,13 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
>>>   */
>>>  void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
>>>  
>>> +/**
>>> + * Assert that the current coroutine holds @mutex.
>>> + */
>>> +static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
>>> +{
>>> +    assert(mutex->locked && mutex->holder == qemu_coroutine_self());
>>> +}
>>>  
>>>  /**
>>>   * CoQueues are a mechanism to queue coroutines in order to continue executing
>> I think that we should use atomic_read(&mutex->locked) and require
>> barriers when working with the holder.
> Hm, this would only be necessary for the case that the assertion doesn't
> hold true. I'll do the atomic_read() because it's easy enough, but I
> don't think we need or want barriers here. If another thread modifies
> this concurrently, the condition is false either way.
>
> Kevin
>
It looks like you are correct. We will see either NULL or something else
if we are in the process of taking the lock. Is this worth a comment? :)

Den



* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-24 10:57     ` Kevin Wolf
@ 2019-10-24 11:14       ` Denis Lunev
  2019-10-24 12:07         ` Kevin Wolf
  2019-10-24 13:03       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 19+ messages in thread
From: Denis Lunev @ 2019-10-24 11:14 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-devel, qemu-stable, dgilbert, mreitz, lersek

On 10/24/19 1:57 PM, Kevin Wolf wrote:
> On 24.10.2019 at 12:01, Denis Lunev wrote:
>> On 10/23/19 6:26 PM, Kevin Wolf wrote:
>>> qcow2_cache_do_get() requires that s->lock is locked because it can
>>> yield between picking a cache entry and actually taking ownership of it
>>> by setting offset and increasing the reference count.
>>>
>>> Add an assertion to make sure the caller really holds the lock. The
>>> function can be called outside of coroutine context, where bdrv_pread
>>> and flushes become synchronous operations. The lock cannot and need not
>>> be taken in this case.
>>>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>> ---
>>>  block/qcow2-cache.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
>>> index d29b038a67..75b13dad99 100644
>>> --- a/block/qcow2-cache.c
>>> +++ b/block/qcow2-cache.c
>>> @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
>>>      int min_lru_index = -1;
>>>  
>>>      assert(offset != 0);
>>> +    if (qemu_in_coroutine()) {
>>> +        qemu_co_mutex_assert_locked(&s->lock);
>>> +    }
>> That doesn't look good to me. If this really requires the lock, we should
>> always check for the lock. On the other hand, we could face a missed
>> lock outside of coroutine context.
> As the commit message explains, outside of coroutine context, we can't
> yield and bdrv_pread and bdrv_flush become synchronous operations
> instead, so there is nothing else that we need to protect against.
>
> Kevin
>
Hmm. It seems I was not careful enough when reading the entire message.
I am fine with this, though it looks a bit tricky to me, as such things
can change in the future.

Anyway, you could consider this as

Reviewed-by: Denis V. Lunev <den@openvz.org>

Den



* Re: [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  2019-10-24 10:46   ` Vladimir Sementsov-Ogievskiy
@ 2019-10-24 11:17     ` Kevin Wolf
  2019-10-24 12:41       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-10-24 11:17 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: psyhomb, michael, Denis Lunev, qemu-block, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

On 24.10.2019 at 12:46, Vladimir Sementsov-Ogievskiy wrote:
> 23.10.2019 18:26, Kevin Wolf wrote:
> > qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
> > requires s->lock to be taken to protect its accesses to the refcount
> > table and refcount blocks. However, nothing in this code path actually
> > took the lock. This could cause the same cache entry to be used by two
> > requests at the same time, for different tables at different offsets,
> > resulting in image corruption.
> > 
> > As it would be preferable to base the detection on consistent data (even
> > though it's just heuristics), let's take the lock not only around the
> > qcow2_get_refcount() calls, but around the whole function.
> > 
> > This patch takes the lock in qcow2_co_block_status() earlier and asserts
> > in qcow2_detect_metadata_preallocation() that we hold the lock.
> > 
> > Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> > Cc: qemu-stable@nongnu.org
> > Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> 
> Oh, I'm very sorry. I was going to backport this patch, and found that it
> was fixed in our downstream long ago, even before the last upstream version
> of the patch was sent :(

Seriously? Is your downstream QEMU so divergent from upstream that you
wouldn't notice something like this? I think you really have to improve
something there.

I mean, we have a serious data corruptor in the 4.1 release and I wasted
days to debug this, and you've been sitting on a fix for months without
telling anyone? This isn't a memory leak or something, this kills
people's images. It's eating their data.

Modifying an image format driver requires utmost care and I think I'll
try to make sure to allow only a few qcow2 changes per release in the
future. Too many changes were made in too short a time, and with too
little care, and there are even more qcow2 patches pending. Please check
whether these series actually match your downstream code. Anyway, we'll
tread very carefully now with the pending patches, even if it means
delaying them for another release or two. Stability is way more
important for an image format driver than new features and
optimisations.

Do whatever you need to fix your downstream process, but seriously, this
must not ever happen again.

Kevin




* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-24 11:14       ` Denis Lunev
@ 2019-10-24 12:07         ` Kevin Wolf
  0 siblings, 0 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-24 12:07 UTC (permalink / raw)
  To: Denis Lunev
  Cc: psyhomb, michael, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-devel, qemu-stable, dgilbert, mreitz, lersek

Am 24.10.2019 um 13:14 hat Denis Lunev geschrieben:
> On 10/24/19 1:57 PM, Kevin Wolf wrote:
> On 24.10.2019 at 12:01, Denis Lunev wrote:
> >> On 10/23/19 6:26 PM, Kevin Wolf wrote:
> >>> qcow2_cache_do_get() requires that s->lock is locked because it can
> >>> yield between picking a cache entry and actually taking ownership of it
> >>> by setting offset and increasing the reference count.
> >>>
> >>> Add an assertion to make sure the caller really holds the lock. The
> >>> function can be called outside of coroutine context, where bdrv_pread
> >>> and flushes become synchronous operations. The lock cannot and need not
> >>> be taken in this case.
> >>>
> >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >>> ---
> >>>  block/qcow2-cache.c | 5 +++++
> >>>  1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> >>> index d29b038a67..75b13dad99 100644
> >>> --- a/block/qcow2-cache.c
> >>> +++ b/block/qcow2-cache.c
> >>> @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
> >>>      int min_lru_index = -1;
> >>>  
> >>>      assert(offset != 0);
> >>> +    if (qemu_in_coroutine()) {
> >>> +        qemu_co_mutex_assert_locked(&s->lock);
> >>> +    }
> >> That doesn't look good to me. If this really requires the lock, we should
> >> always check for the lock. On the other hand, we could face a missed
> >> lock outside of coroutine context.
> > As the commit message explains, outside of coroutine context, we can't
> > yield and bdrv_pread and bdrv_flush become synchronous operations
> > instead, so there is nothing else that we need to protect against.
> >
> Hmm. It seems I was not careful enough when reading the entire message.
> I am fine with this, though it looks a bit tricky to me, as such things
> can change in the future.

In which way do you think this could change? It's a pretty fundamental
fact about non-coroutine code that it can't yield.

What could change, of course, is that some code switches from being
synchronous to using a coroutine. The assertion would automatically
apply then and catch the bug if adding proper locking is forgotten.

> Anyway, you could consider this as
> 
> Reviewed-by: Denis V. Lunev <den@openvz.org>

Thanks!

Kevin




* Re: [PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  2019-10-24 11:17     ` Kevin Wolf
@ 2019-10-24 12:41       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-10-24 12:41 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: psyhomb, michael, Denis Lunev, qemu-block, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

24.10.2019 14:17, Kevin Wolf wrote:
> On 24.10.2019 at 12:46, Vladimir Sementsov-Ogievskiy wrote:
>> 23.10.2019 18:26, Kevin Wolf wrote:
>>> qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
>>> requires s->lock to be taken to protect its accesses to the refcount
>>> table and refcount blocks. However, nothing in this code path actually
>>> took the lock. This could cause the same cache entry to be used by two
>>> requests at the same time, for different tables at different offsets,
>>> resulting in image corruption.
>>>
>>> As it would be preferable to base the detection on consistent data (even
>>> though it's just heuristics), let's take the lock not only around the
>>> qcow2_get_refcount() calls, but around the whole function.
>>>
>>> This patch takes the lock in qcow2_co_block_status() earlier and asserts
>>> in qcow2_detect_metadata_preallocation() that we hold the lock.
>>>
>>> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
>>> Cc: qemu-stable@nongnu.org
>>> Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>
>> Oh, I'm very sorry. I was going to backport this patch, and found that it
>> was fixed in our downstream long ago, even before the last upstream version
>> of the patch was sent :(
> 
> Seriously? Is your downstream QEMU so divergent from upstream that you
> wouldn't notice something like this? I think you really have to improve
> something there.

How can we improve it? It takes months and years to bring patches into
upstream, so of course we have to live with a bunch of patches (now nearer
to 300, which is a lot better than the 500+ in the recent past) on top of
the RHEL release.

> 
> I mean, we have a serious data corruptor in the 4.1 release and I wasted
> days to debug this, and you've been sitting on a fix for months without
> telling anyone?

Of course, it was not my goal to hide the fix; I'll do my best to avoid this
in the future. I'm very sorry for the time you wasted.

> This isn't a memory leak or something, this kills
> people's images. It's eating their data.

Possibly; I didn't know about the data corruption. For some reason I decided
that the lock should be taken here, though I don't remember why. Still, I
should have applied the fix to the upstream version too, of course.

> 
> Modifying an image format driver requires utmost care and I think I'll
> try to make sure to allow only a few qcow2 changes per release in the
> future. Too many changes were made in too short a time, and with too
> little care, and there are even more qcow2 patches pending. Please check
> whether these series actually match your downstream code.

The difference is that I didn't move the lock() but added a separate
lock()/unlock() pair around qcow2_detect_metadata_preallocation(). I think
there is no serious difference.

> Anyway, we'll
> tread very carefully now with the pending patches, even if it means
> delaying them for another release or two.

What do you mean? What should happen to the patches that have already been
sent over the past several months?

> Stability is way more
> important for an image format driver than new features and
> optimisations.

Agreed. But I think the main source of stability is testing.

> 
> Do whatever you need to fix your downstream process, but seriously, this
> must not ever happen again.
> 

I don't see how to improve the process to exclude such problems entirely,
but of course I will do my best to avoid them.



-- 
Best regards,
Vladimir



* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-24 10:57     ` Kevin Wolf
  2019-10-24 11:14       ` Denis Lunev
@ 2019-10-24 13:03       ` Vladimir Sementsov-Ogievskiy
  2019-10-24 13:17         ` Kevin Wolf
  1 sibling, 1 reply; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-10-24 13:03 UTC (permalink / raw)
  To: Kevin Wolf, Denis Lunev
  Cc: psyhomb, michael, qemu-block, qemu-devel, qemu-stable, dgilbert,
	mreitz, lersek

24.10.2019 13:57, Kevin Wolf wrote:
> On 24.10.2019 at 12:01, Denis Lunev wrote:
>> On 10/23/19 6:26 PM, Kevin Wolf wrote:
>>> qcow2_cache_do_get() requires that s->lock is locked because it can
>>> yield between picking a cache entry and actually taking ownership of it
>>> by setting offset and increasing the reference count.
>>>
>>> Add an assertion to make sure the caller really holds the lock. The
>>> function can be called outside of coroutine context, where bdrv_pread
>>> and flushes become synchronous operations. The lock cannot and need not
>>> be taken in this case.
>>>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>> ---
>>>   block/qcow2-cache.c | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
>>> index d29b038a67..75b13dad99 100644
>>> --- a/block/qcow2-cache.c
>>> +++ b/block/qcow2-cache.c
>>> @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
>>>       int min_lru_index = -1;
>>>   
>>>       assert(offset != 0);
>>> +    if (qemu_in_coroutine()) {
>>> +        qemu_co_mutex_assert_locked(&s->lock);
>>> +    }
>>
>> That doesn't look good to me. If this really requires the lock, we should
>> always check for the lock. On the other hand, we could face a missed
>> lock outside of coroutine context.
> 
> As the commit message explains, outside of coroutine context, we can't
> yield and bdrv_pread and bdrv_flush become synchronous operations
> instead, so there is nothing else that we need to protect against.
> 

Recently we discussed similar problems with the block-dirty-bitmap-* QMP
commands, which wanted to update qcow2 bitmap metadata from non-coroutine
context:
"qcow2 lock"
<135df452-397a-30bb-7518-2184fa5971aa@virtuozzo.com>
https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg01419.html

And, as I understand it, the correct way is to convert to a coroutine and
lock the mutex appropriately. Alternatively, we can lock the AioContext and
stay in a drained section to avoid parallel requests accessing the critical
section. Should we assert such conditions in the !qemu_in_coroutine() case?
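
For reference, the usual shape of such a conversion in the block layer is
roughly the following sketch (foo/foo_locked are hypothetical names, not
the actual bitmap fix; the caller is assumed to hold the AioContext lock):

    typedef struct FooCo {
        BlockDriverState *bs;
        int ret;
        bool done;
    } FooCo;

    static void coroutine_fn foo_co_entry(void *opaque)
    {
        FooCo *f = opaque;
        BDRVQcow2State *s = f->bs->opaque;

        /* Now in coroutine context, so taking s->lock is safe */
        qemu_co_mutex_lock(&s->lock);
        f->ret = foo_locked(f->bs);    /* hypothetical locked helper */
        qemu_co_mutex_unlock(&s->lock);
        f->done = true;
    }

    static int foo(BlockDriverState *bs)
    {
        FooCo f = { .bs = bs, .done = false };

        if (qemu_in_coroutine()) {
            foo_co_entry(&f);
        } else {
            Coroutine *co = qemu_coroutine_create(foo_co_entry, &f);
            bdrv_coroutine_enter(bs, co);
            BDRV_POLL_WHILE(bs, !f.done);
        }
        return f.ret;
    }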

--
Final commit to fix bug about bitmaps:

commit d2c3080e41fd2c9bc36c996cc9d33804462ba803
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Fri Sep 20 11:25:43 2019 +0300

     block/qcow2: proper locking on bitmap add/remove paths




-- 
Best regards,
Vladimir



* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-24 13:03       ` Vladimir Sementsov-Ogievskiy
@ 2019-10-24 13:17         ` Kevin Wolf
  0 siblings, 0 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-24 13:17 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: psyhomb, michael, Denis Lunev, qemu-block, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

On 24.10.2019 at 15:03, Vladimir Sementsov-Ogievskiy wrote:
> 24.10.2019 13:57, Kevin Wolf wrote:
> > On 24.10.2019 at 12:01, Denis Lunev wrote:
> >> On 10/23/19 6:26 PM, Kevin Wolf wrote:
> >>> qcow2_cache_do_get() requires that s->lock is locked because it can
> >>> yield between picking a cache entry and actually taking ownership of it
> >>> by setting offset and increasing the reference count.
> >>>
> >>> Add an assertion to make sure the caller really holds the lock. The
> >>> function can be called outside of coroutine context, where bdrv_pread
> >>> and flushes become synchronous operations. The lock cannot and need not
> >>> be taken in this case.
> >>>
> >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >>> ---
> >>>   block/qcow2-cache.c | 5 +++++
> >>>   1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> >>> index d29b038a67..75b13dad99 100644
> >>> --- a/block/qcow2-cache.c
> >>> +++ b/block/qcow2-cache.c
> >>> @@ -327,6 +327,9 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
> >>>       int min_lru_index = -1;
> >>>   
> >>>       assert(offset != 0);
> >>> +    if (qemu_in_coroutine()) {
> >>> +        qemu_co_mutex_assert_locked(&s->lock);
> >>> +    }
> >>
> >> That doesn't look good to me. If this really requires the lock, we should
> >> always check for the lock. On the other hand, we could face a missed
> >> lock outside of coroutine context.
> > 
> > As the commit message explains, outside of coroutine context, we can't
> > yield and bdrv_pread and bdrv_flush become synchronous operations
> > instead, so there is nothing else that we need to protect against.
> > 
> 
> Recently we discussed similar problems with the block-dirty-bitmap-* QMP
> commands, which wanted to update qcow2 bitmap metadata from non-coroutine
> context: "qcow2 lock"
> <135df452-397a-30bb-7518-2184fa5971aa@virtuozzo.com>
> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg01419.html

Hm, right, I already forgot about the nested event loop again...

> And, as I understand it, the correct way is to convert to a coroutine and
> lock the mutex appropriately. Alternatively, we can lock the AioContext and
> stay in a drained section to avoid parallel requests accessing the critical
> section. Should we assert such conditions in the !qemu_in_coroutine() case?

The AioContext lock must be held anyway, so I don't think this would be
a new requirement. As for draining, I'll have to see.

I'm currently still auditing all the callers of qcow2_cache_do_get().
The synchronous callers I already know are the snapshot functions. I
think these happen to be in a drained section anyway (or should be at
least), so AioContext lock + drain seems like a very reasonable option
for them.
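
(For illustration, the drain pattern meant here is roughly the following
sketch; ret and snapshot_id are assumed to be in scope:)

    /* Caller holds the AioContext lock; with all requests drained,
     * nothing can access the cache concurrently, so the synchronous
     * path is safe without taking s->lock. */
    bdrv_drained_begin(bs);
    ret = qcow2_snapshot_goto(bs, snapshot_id);
    bdrv_drained_end(bs);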

For other synchronous callers, if any, maybe conversion to a coroutine
would make more sense.

Kevin




* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-23 15:37   ` Kevin Wolf
@ 2019-10-25 10:35     ` Michael Weiser
  2019-10-25 12:42       ` Kevin Wolf
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Weiser @ 2019-10-25 10:35 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: psyhomb, vsementsov, den, qemu-block, qemu-devel, qemu-stable,
	dgilbert, mreitz, lersek

Hi Kevin,

On Wed, Oct 23, 2019 at 05:37:49PM +0200, Kevin Wolf wrote:

> > qcow2_cache_do_get() requires that s->lock is locked because it can
> > yield between picking a cache entry and actually taking ownership of it
> > by setting offset and increasing the reference count.
> > 
> > Add an assertion to make sure the caller really holds the lock. The
> > function can be called outside of coroutine context, where bdrv_pread
> > and flushes become synchronous operations. The lock cannot and need not
> > be taken in this case.
> I'm still running tests to see if any other code paths trigger the
> assertion, but image creation calls this without the lock held (which is
> harmless because nobody else knows about the image so there won't be
> concurrent requests). The following patch is needed additionally to make
> image creation work with the new assertion.

I can confirm that with all four patches applied, corruption no longer
occurs as of commit 69f47505ee66afaa513305de0c1895a224e52c45. With
only 3/3 (qcow2: Fix corruption bug in
qcow2_detect_metadata_preallocation()) removed, the assertion triggers
after a few seconds, leaving behind a few leaked clusters but no errors
in the image.

(qemu) qemu-system-x86_64:qemu/include/qemu/coroutine.h:175:
qemu_co_mutex_assert_locked: Assertion `mutex->locked && mutex->holder
== qemu_coroutine_self()' failed.
Aborted (core dumped)

$ qemu-img check qtest.qcow2 
Leaked cluster 169257 refcount=3 reference=2
Leaked cluster 172001 refcount=1 reference=0
Leaked cluster 172002 refcount=1 reference=0
Leaked cluster 172003 refcount=1 reference=0
Leaked cluster 172004 refcount=1 reference=0
Leaked cluster 172005 refcount=1 reference=0
Leaked cluster 172006 refcount=1 reference=0
Leaked cluster 172007 refcount=1 reference=0
Leaked cluster 172008 refcount=1 reference=0
Leaked cluster 172009 refcount=1 reference=0
Leaked cluster 172010 refcount=1 reference=0
Leaked cluster 172011 refcount=1 reference=0
Leaked cluster 172012 refcount=1 reference=0

13 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
255525/327680 = 77.98% allocated, 3.22% fragmented, 0.00% compressed
clusters
Image end offset: 17106403328

I was going to test with master as well but got overtaken by v2. Will
move on to test v2 now. :)

Series:
Tested-by: Michael Weiser <michael.weiser@gmx.de>

No biggie, but if there's a chance, could you switch my address to the
one above?
-- 
Thanks,
Michael



* Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
  2019-10-25 10:35     ` Michael Weiser
@ 2019-10-25 12:42       ` Kevin Wolf
  0 siblings, 0 replies; 19+ messages in thread
From: Kevin Wolf @ 2019-10-25 12:42 UTC (permalink / raw)
  To: Michael Weiser
  Cc: psyhomb, vsementsov, den, qemu-block, qemu-devel, qemu-stable,
	dgilbert, mreitz, lersek

On 25.10.2019 at 12:35, Michael Weiser wrote:
> I was going to test with master as well but got overtaken by v2. Will
> move on to test v2 now. :)
> 
> Series:
> Tested-by: Michael Weiser <michael.weiser@gmx.de>

Thanks for testing!

The fix itself is unchanged in v2, so I assume the result will be the
same, but testing it explicitly can't hurt. I'm going to send a pull
request today, but if you're quick, I can wait for your results.

> No biggie but if there's a chance could you switch my address to the
> above?

No problem, I've updated the address in the Reported-by tag.

Kevin



