All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1
@ 2019-10-24 14:26 Kevin Wolf
  2019-10-24 14:26 ` [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Kevin Wolf @ 2019-10-24 14:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

This series fixes an image corruption bug that was introduced in commit
69f47505e ('block: avoid recursive block_status call if possible'),
first contained in the QEMU 4.1.0 release.

This bug was reported by Michael Weiser on Launchpad:
https://bugs.launchpad.net/qemu/+bug/1846427

v2:

- Dropped the assertion in qcow2_cache_do_get() for now. Making sure
  that it actually holds true for all callers requires more work and
  getting the corruption fix in quickly is important.

- Use atomic_read() and add comment to qemu_co_mutex_assert_locked()
  implementation [Denis]

Kevin Wolf (2):
  coroutine: Add qemu_co_mutex_assert_locked()
  qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()

 include/qemu/coroutine.h | 15 +++++++++++++++
 block/qcow2-refcount.c   |  2 ++
 block/qcow2.c            |  3 ++-
 3 files changed, 19 insertions(+), 1 deletion(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-24 14:26 [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
@ 2019-10-24 14:26 ` Kevin Wolf
  2019-10-24 15:35   ` Vladimir Sementsov-Ogievskiy
  2019-10-24 14:26 ` [PATCH v2 2/2] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Kevin Wolf @ 2019-10-24 14:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

Some functions require that the caller holds a certain CoMutex for them
to operate correctly. Add a function so that they can assert the lock is
really held.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/qemu/coroutine.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9801e7f5a4..f4843b5f59 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -167,6 +167,21 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
  */
 void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
 
+/**
+ * Assert that the current coroutine holds @mutex.
+ */
+static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
+{
+    /*
+     * mutex->holder doesn't need any synchronisation if the assertion holds
+     * true because the mutex protects it. If it doesn't hold true, we still
+     * don't mind if another thread takes or releases mutex behind our back,
+     * because the condition will be false no matter whether we read NULL or
+     * the pointer for any other coroutine.
+     */
+    assert(atomic_read(&mutex->locked) &&
+           mutex->holder == qemu_coroutine_self());
+}
 
 /**
  * CoQueues are a mechanism to queue coroutines in order to continue executing
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
  2019-10-24 14:26 [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
  2019-10-24 14:26 ` [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
@ 2019-10-24 14:26 ` Kevin Wolf
  2019-10-25 10:44 ` [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Max Reitz
  2019-10-25 14:42 ` Michael Weiser
  3 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2019-10-24 14:26 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, psyhomb, michael, vsementsov, den, qemu-devel,
	qemu-stable, dgilbert, mreitz, lersek

qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
requires s->lock to be taken to protect its accesses to the refcount
table and refcount blocks. However, nothing in this code path actually
took the lock. This could cause the same cache entry to be used by two
requests at the same time, for different tables at different offsets,
resulting in image corruption.

As it would be preferable to base the detection on consistent data (even
though it's just heuristics), let's take the lock not only around the
qcow2_get_refcount() calls, but around the whole function.

This patch takes the lock in qcow2_co_block_status() earlier and asserts
in qcow2_detect_metadata_preallocation() that we hold the lock.

Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Cc: qemu-stable@nongnu.org
Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-refcount.c | 2 ++
 block/qcow2.c          | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ef965d7895..0d64bf5a5e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
     int64_t i, end_cluster, cluster_count = 0, threshold;
     int64_t file_length, real_allocation, real_clusters;
 
+    qemu_co_mutex_assert_locked(&s->lock);
+
     file_length = bdrv_getlength(bs->file->bs);
     if (file_length < 0) {
         return file_length;
diff --git a/block/qcow2.c b/block/qcow2.c
index 8b05933565..0bc69e6996 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1916,6 +1916,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     unsigned int bytes;
     int status = 0;
 
+    qemu_co_mutex_lock(&s->lock);
+
     if (!s->metadata_preallocation_checked) {
         ret = qcow2_detect_metadata_preallocation(bs);
         s->metadata_preallocation = (ret == 1);
@@ -1923,7 +1925,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     }
 
     bytes = MIN(INT_MAX, count);
-    qemu_co_mutex_lock(&s->lock);
     ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-24 14:26 ` [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
@ 2019-10-24 15:35   ` Vladimir Sementsov-Ogievskiy
  2019-10-24 15:50     ` Denis Lunev
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-10-24 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: psyhomb, michael, Denis Lunev, qemu-devel, qemu-stable, dgilbert,
	mreitz, lersek

24.10.2019 17:26, Kevin Wolf wrote:
> Some functions require that the caller holds a certain CoMutex for them
> to operate correctly. Add a function so that they can assert the lock is
> really held.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   include/qemu/coroutine.h | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index 9801e7f5a4..f4843b5f59 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -167,6 +167,21 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex);
>    */
>   void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex);
>   
> +/**
> + * Assert that the current coroutine holds @mutex.
> + */
> +static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex)
> +{
> +    /*
> +     * mutex->holder doesn't need any synchronisation if the assertion holds
> +     * true because the mutex protects it. If it doesn't hold true, we still
> +     * don't mind if another thread takes or releases mutex behind our back,
> +     * because the condition will be false no matter whether we read NULL or
> +     * the pointer for any other coroutine.
> +     */
> +    assert(atomic_read(&mutex->locked) &&
> +           mutex->holder == qemu_coroutine_self());
> +}
>   
>   /**
>    * CoQueues are a mechanism to queue coroutines in order to continue executing
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked()
  2019-10-24 15:35   ` Vladimir Sementsov-Ogievskiy
@ 2019-10-24 15:50     ` Denis Lunev
  0 siblings, 0 replies; 7+ messages in thread
From: Denis Lunev @ 2019-10-24 15:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Kevin Wolf, qemu-block
  Cc: psyhomb, michael, qemu-stable, qemu-devel, dgilbert, mreitz, lersek

On 10/24/19 6:35 PM, Vladimir Sementsov-Ogievskiy wrote:
> 24.10.2019 17:26, Kevin Wolf wrote:
>> Some functions require that the caller holds a certain CoMutex for them
>> to operate correctly. Add a function so that they can assert the lock is
>> really held.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1
  2019-10-24 14:26 [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
  2019-10-24 14:26 ` [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
  2019-10-24 14:26 ` [PATCH v2 2/2] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
@ 2019-10-25 10:44 ` Max Reitz
  2019-10-25 14:42 ` Michael Weiser
  3 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-10-25 10:44 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block
  Cc: psyhomb, michael, vsementsov, den, qemu-stable, dgilbert,
	qemu-devel, lersek


[-- Attachment #1.1: Type: text/plain, Size: 1034 bytes --]

On 24.10.19 16:26, Kevin Wolf wrote:
> This series fixes an image corruption bug that was introduced in commit
> 69f47505e ('block: avoid recursive block_status call if possible'),
> first contained in the QEMU 4.1.0 release.
> 
> This bug was reported by Michael Weiser on Launchpad:
> https://bugs.launchpad.net/qemu/+bug/1846427
> 
> v2:
> 
> - Dropped the assertion in qcow2_cache_do_get() for now. Making sure
>   that it actually holds true for all callers requires more work and
>   getting the corruption fix in quickly is important.
> 
> - Use atomic_read() and add comment to qemu_co_mutex_assert_locked()
>   implementation [Denis]
> 
> Kevin Wolf (2):
>   coroutine: Add qemu_co_mutex_assert_locked()
>   qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
> 
>  include/qemu/coroutine.h | 15 +++++++++++++++
>  block/qcow2-refcount.c   |  2 ++
>  block/qcow2.c            |  3 ++-
>  3 files changed, 19 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1
  2019-10-24 14:26 [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
                   ` (2 preceding siblings ...)
  2019-10-25 10:44 ` [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Max Reitz
@ 2019-10-25 14:42 ` Michael Weiser
  3 siblings, 0 replies; 7+ messages in thread
From: Michael Weiser @ 2019-10-25 14:42 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: psyhomb, vsementsov, den, qemu-block, qemu-devel, qemu-stable,
	dgilbert, mreitz, lersek

Hello Kevin,

On Thu, Oct 24, 2019 at 04:26:56PM +0200, Kevin Wolf wrote:

> Kevin Wolf (2):
>   coroutine: Add qemu_co_mutex_assert_locked()
>   qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()

Tested-by: Michael Weiser <michael.weiser@gmx.de>

with offending 69f47505e and today's master
(58560ad254fbda71d4daa6622d71683190070ee2).

Corruption does not happen with series applied.
Assertion tiggers as expected if lock is not taken.

FWIW: Reviewed-by: Michael Weiser <michael.weiser@gmx.de>
-- 
Thanks,
Michael


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-25 14:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-24 14:26 [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Kevin Wolf
2019-10-24 14:26 ` [PATCH v2 1/2] coroutine: Add qemu_co_mutex_assert_locked() Kevin Wolf
2019-10-24 15:35   ` Vladimir Sementsov-Ogievskiy
2019-10-24 15:50     ` Denis Lunev
2019-10-24 14:26 ` [PATCH v2 2/2] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Kevin Wolf
2019-10-25 10:44 ` [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1 Max Reitz
2019-10-25 14:42 ` Michael Weiser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.