* [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
@ 2012-02-27 13:16 Stefan Hajnoczi
  2012-03-01 11:04 ` Stefan Hajnoczi
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2012-02-27 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, qemu-stable, Stefan Hajnoczi

The L2 table cache reduces QED metadata reads that would be required
when translating LBAs to offsets into the image file.  Since requests
execute in parallel it is possible to share an L2 table between multiple
requests.

There is a potential data corruption issue when an in-use L2 table is
evicted from the cache because the following situation occurs:

  1. An allocating write performs an update to L2 table "A".

  2. Another request needs L2 table "B" and causes table "A" to be
     evicted.

  3. A new read request needs L2 table "A" but it is not cached.

As a result the L2 update from #1 can overlap with the L2 fetch from #3.
We must avoid doing overlapping I/O requests here since the worst case
outcome is that the L2 fetch completes before the L2 update and yields
stale data.  In that case we would effectively discard the L2 update and
lose data clusters!
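
The three-step sequence above can be modelled as a small deterministic sketch. Everything in it is hypothetical and for illustration only: the `Table` type, the `fetch_sees_update()` function, and the offsets are not QED code, and the asynchronous I/O is collapsed into explicit steps taken in the worst-case order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TABLE_ENTRIES 4 /* hypothetical; far smaller than a real L2 table */

/* Hypothetical stand-in for an L2 table: cluster index -> image file offset. */
typedef struct {
    uint64_t offsets[TABLE_ENTRIES];
} Table;

/*
 * Returns true if the request in step 3 sees the mapping added in step 1.
 * evict_in_use selects whether step 2 may evict table "A" while its
 * write-out is still in flight.
 */
bool fetch_sees_update(bool evict_in_use)
{
    Table on_disk = { {0} }; /* table "A" as stored in the image file */
    Table cached = on_disk;  /* the copy held in the L2 cache */
    Table seen;              /* what the request in step 3 works with */

    /* Step 1: the allocating write updates the cached copy; the write-out
     * to the image file has been submitted but has not completed yet. */
    cached.offsets[1] = 0x10000;

    if (evict_in_use) {
        /* Steps 2 and 3: "A" was evicted, so the new request fetches it
         * from the image file.  Worst case: the fetch completes first. */
        seen = on_disk;
    } else {
        /* "A" stayed cached, so the new request shares the updated copy. */
        seen = cached;
    }

    /* Only now does the write-out from step 1 complete. */
    on_disk = cached;
    assert(on_disk.offsets[1] == 0x10000);

    return seen.offsets[1] == 0x10000;
}
```

In this model `fetch_sees_update(true)` returns false: the fetched copy is stale even though the image file is eventually correct, which is the situation the patch is meant to rule out.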

Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
and debugging which led to discovery of this bug.

Reported-by: Benoît Canet <benoit.canet@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
Please include this in -stable once it has been merged into qemu.git/master.

 block/qed-l2-cache.c |   22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
index 02b81a2..e9b2aae 100644
--- a/block/qed-l2-cache.c
+++ b/block/qed-l2-cache.c
@@ -161,11 +161,25 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
         return;
     }
 
+    /* Evict an unused cache entry so we have space.  If all entries are in use
+     * we can grow the cache temporarily and we try to shrink back down later.
+     */
     if (l2_cache->n_entries >= MAX_L2_CACHE_SIZE) {
-        entry = QTAILQ_FIRST(&l2_cache->entries);
-        QTAILQ_REMOVE(&l2_cache->entries, entry, node);
-        l2_cache->n_entries--;
-        qed_unref_l2_cache_entry(entry);
+        CachedL2Table *next;
+        QTAILQ_FOREACH_SAFE(entry, &l2_cache->entries, node, next) {
+            if (entry->ref > 1) {
+                continue;
+            }
+
+            QTAILQ_REMOVE(&l2_cache->entries, entry, node);
+            l2_cache->n_entries--;
+            qed_unref_l2_cache_entry(entry);
+
+            /* Stop evicting when we've shrunk back to max size */
+            if (l2_cache->n_entries < MAX_L2_CACHE_SIZE) {
+                break;
+            }
+        }
     }
 
     l2_cache->n_entries++;
-- 
1.7.9
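
For readers without the QEMU tree at hand, the eviction policy in the hunk above can be sketched as a standalone program. This is a minimal sketch under stated assumptions, not the QED implementation: `Entry`, `Cache`, `MAX_CACHE_SIZE`, and all function names are hypothetical, a hand-rolled singly linked list stands in for the QTAILQ macros, and the existing-entry lookup that precedes this hunk is omitted. As in the patch, a reference count of 1 means only the cache holds the entry, and anything above 1 means a request is still using it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CACHE_SIZE 2 /* hypothetical; kept tiny so the demo is short */

typedef struct Entry {
    int id;
    int ref;            /* 1 = held only by the cache, >1 = in use */
    struct Entry *next; /* list is ordered oldest first */
} Entry;

typedef struct {
    Entry *head;
    size_t n_entries;
} Cache;

Entry *entry_new(int id)
{
    Entry *e = calloc(1, sizeof(*e));
    if (e == NULL) {
        abort();
    }
    e->id = id;
    e->ref = 1; /* the creator's reference, handed to the cache on commit */
    return e;
}

void entry_unref(Entry *e)
{
    if (--e->ref == 0) {
        free(e);
    }
}

bool cache_contains(const Cache *c, int id)
{
    for (const Entry *e = c->head; e != NULL; e = e->next) {
        if (e->id == id) {
            return true;
        }
    }
    return false;
}

/* Insert e, taking over the caller's reference.  Unused entries are evicted
 * oldest first; if every entry is in use the cache grows past the limit and
 * shrinks back on a later commit. */
void cache_commit(Cache *c, Entry *e)
{
    Entry **link = &c->head;

    while (*link != NULL && c->n_entries >= MAX_CACHE_SIZE) {
        Entry *cur = *link;
        if (cur->ref > 1) {
            link = &cur->next; /* in use: skip, never evict */
            continue;
        }
        *link = cur->next;     /* unlink and drop the cache's reference */
        c->n_entries--;
        entry_unref(cur);
    }

    while (*link != NULL) {    /* walk to the tail and append */
        link = &(*link)->next;
    }
    e->next = NULL;
    *link = e;
    c->n_entries++;
}

/* Usage: grow past the limit while entries are busy, then shrink back. */
size_t demo_grow_then_shrink(void)
{
    Cache c = { NULL, 0 };
    Entry *a = entry_new(1);
    Entry *b = entry_new(2);

    cache_commit(&c, a);
    cache_commit(&c, b);
    a->ref++;                       /* two requests start using a and b */
    b->ref++;
    cache_commit(&c, entry_new(3)); /* nothing evictable: cache grows to 3 */
    assert(c.n_entries == 3);

    entry_unref(a);                 /* the requests finish */
    entry_unref(b);
    cache_commit(&c, entry_new(4)); /* a and b are evicted, 3 and 4 remain */
    return c.n_entries;
}
```

The loop condition mirrors the patch's "stop evicting when we've shrunk back to max size" check, so a single commit can evict more than one entry if the cache had grown temporarily.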

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-02-27 13:16 [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries Stefan Hajnoczi
@ 2012-03-01 11:04 ` Stefan Hajnoczi
  2012-03-01 13:11 ` Benoît Canet
  2012-03-01 16:10 ` Kevin Wolf
  2 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2012-03-01 11:04 UTC (permalink / raw)
  To: Benoît Canet; +Cc: Kevin Wolf, qemu-devel, qemu-stable

On Mon, Feb 27, 2012 at 1:16 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
>
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
>
>  1. An allocating write performs an update to L2 table "A".
>
>  2. Another request needs L2 table "B" and causes table "A" to be
>     evicted.
>
>  3. A new read request needs L2 table "A" but it is not cached.
>
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
>
> Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
> and debugging which led to discovery of this bug.
>
> Reported-by: Benoît Canet <benoit.canet@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Please include this in -stable once it has been merged into qemu.git/master.
>
>  block/qed-l2-cache.c |   22 ++++++++++++++++++----
>  1 files changed, 18 insertions(+), 4 deletions(-)

Thanks for testing this fix and confirming it works, Benoît.  Feel
free to reply with your Tested-by: line.

Stefan

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-02-27 13:16 [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries Stefan Hajnoczi
  2012-03-01 11:04 ` Stefan Hajnoczi
@ 2012-03-01 13:11 ` Benoît Canet
  2012-03-01 16:10 ` Kevin Wolf
  2 siblings, 0 replies; 7+ messages in thread
From: Benoît Canet @ 2012-03-01 13:11 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, qemu-devel, qemu-stable

Tested-by: Benoît Canet <benoit.canet@gmail.com>

On Mon, Feb 27, 2012 at 2:16 PM, Stefan Hajnoczi <
stefanha@linux.vnet.ibm.com> wrote:

> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
>
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
>
>  1. An allocating write performs an update to L2 table "A".
>
>  2. Another request needs L2 table "B" and causes table "A" to be
>     evicted.
>
>  3. A new read request needs L2 table "A" but it is not cached.
>
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
>
> Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
> and debugging which led to discovery of this bug.
>
> Reported-by: Benoît Canet <benoit.canet@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Please include this in -stable once it has been merged into
> qemu.git/master.
>
>  block/qed-l2-cache.c |   22 ++++++++++++++++++----
>  1 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
> index 02b81a2..e9b2aae 100644
> --- a/block/qed-l2-cache.c
> +++ b/block/qed-l2-cache.c
> @@ -161,11 +161,25 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
>         return;
>     }
>
> +    /* Evict an unused cache entry so we have space.  If all entries are in use
> +     * we can grow the cache temporarily and we try to shrink back down later.
> +     */
>     if (l2_cache->n_entries >= MAX_L2_CACHE_SIZE) {
> -        entry = QTAILQ_FIRST(&l2_cache->entries);
> -        QTAILQ_REMOVE(&l2_cache->entries, entry, node);
> -        l2_cache->n_entries--;
> -        qed_unref_l2_cache_entry(entry);
> +        CachedL2Table *next;
> +        QTAILQ_FOREACH_SAFE(entry, &l2_cache->entries, node, next) {
> +            if (entry->ref > 1) {
> +                continue;
> +            }
> +
> +            QTAILQ_REMOVE(&l2_cache->entries, entry, node);
> +            l2_cache->n_entries--;
> +            qed_unref_l2_cache_entry(entry);
> +
> +            /* Stop evicting when we've shrunk back to max size */
> +            if (l2_cache->n_entries < MAX_L2_CACHE_SIZE) {
> +                break;
> +            }
> +        }
>     }
>
>     l2_cache->n_entries++;
> --
> 1.7.9
>
>
>

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-02-27 13:16 [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries Stefan Hajnoczi
  2012-03-01 11:04 ` Stefan Hajnoczi
  2012-03-01 13:11 ` Benoît Canet
@ 2012-03-01 16:10 ` Kevin Wolf
  2012-03-01 16:22   ` Stefan Hajnoczi
  2 siblings, 1 reply; 7+ messages in thread
From: Kevin Wolf @ 2012-03-01 16:10 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-stable

On 27.02.2012 14:16, Stefan Hajnoczi wrote:
> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
> 
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
> 
>   1. An allocating write performs an update to L2 table "A".
> 
>   2. Another request needs L2 table "B" and causes table "A" to be
>      evicted.
> 
>   3. A new read request needs L2 table "A" but it is not cached.
> 
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
> 
> Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
> and debugging which led to discovery of this bug.
> 
> Reported-by: Benoît Canet <benoit.canet@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Thanks, applied to the block branch.

How about a qemu-iotests case?

Kevin

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-03-01 16:10 ` Kevin Wolf
@ 2012-03-01 16:22   ` Stefan Hajnoczi
  2012-03-01 16:58     ` Kevin Wolf
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2012-03-01 16:22 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-stable

On Thu, Mar 01, 2012 at 05:10:57PM +0100, Kevin Wolf wrote:
> On 27.02.2012 14:16, Stefan Hajnoczi wrote:
> > The L2 table cache reduces QED metadata reads that would be required
> > when translating LBAs to offsets into the image file.  Since requests
> > execute in parallel it is possible to share an L2 table between multiple
> > requests.
> > 
> > There is a potential data corruption issue when an in-use L2 table is
> > evicted from the cache because the following situation occurs:
> > 
> >   1. An allocating write performs an update to L2 table "A".
> > 
> >   2. Another request needs L2 table "B" and causes table "A" to be
> >      evicted.
> > 
> >   3. A new read request needs L2 table "A" but it is not cached.
> > 
> > As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> > We must avoid doing overlapping I/O requests here since the worst case
> > outcome is that the L2 fetch completes before the L2 update and yields
> > stale data.  In that case we would effectively discard the L2 update and
> > lose data clusters!
> > 
> > Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
> > and debugging which led to discovery of this bug.
> > 
> > Reported-by: Benoît Canet <benoit.canet@gmail.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> 
> Thanks, applied to the block branch.
> 
> How about a qemu-iotests case?

The test case is not ready yet.  I started writing one but it is racy
because I haven't introduced a way of controlling AIO issue/complete for
tests.  My next step is to add that.

Stefan

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-03-01 16:22   ` Stefan Hajnoczi
@ 2012-03-01 16:58     ` Kevin Wolf
  2012-03-05 11:51       ` Stefan Hajnoczi
  0 siblings, 1 reply; 7+ messages in thread
From: Kevin Wolf @ 2012-03-01 16:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-stable

On 01.03.2012 17:22, Stefan Hajnoczi wrote:
> On Thu, Mar 01, 2012 at 05:10:57PM +0100, Kevin Wolf wrote:
>> On 27.02.2012 14:16, Stefan Hajnoczi wrote:
>>> The L2 table cache reduces QED metadata reads that would be required
>>> when translating LBAs to offsets into the image file.  Since requests
>>> execute in parallel it is possible to share an L2 table between multiple
>>> requests.
>>>
>>> There is a potential data corruption issue when an in-use L2 table is
>>> evicted from the cache because the following situation occurs:
>>>
>>>   1. An allocating write performs an update to L2 table "A".
>>>
>>>   2. Another request needs L2 table "B" and causes table "A" to be
>>>      evicted.
>>>
>>>   3. A new read request needs L2 table "A" but it is not cached.
>>>
>>> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
>>> We must avoid doing overlapping I/O requests here since the worst case
>>> outcome is that the L2 fetch completes before the L2 update and yields
>>> stale data.  In that case we would effectively discard the L2 update and
>>> lose data clusters!
>>>
>>> Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
>>> and debugging which led to discovery of this bug.
>>>
>>> Reported-by: Benoît Canet <benoit.canet@gmail.com>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
>>
>> Thanks, applied to the block branch.
>>
>> How about a qemu-iotests case?
> 
> The test case is not ready yet.  I started writing one but it is racy
> because I haven't introduced a way of controlling AIO issue/complete for
> tests.  My next step is to add that.

Will it be specific to image formats using AIO then or is it generic
enough that coroutine-based drivers work with it as well?

Kevin

* Re: [Qemu-devel] [PATCH] qed: do not evict in-use L2 table cache entries
  2012-03-01 16:58     ` Kevin Wolf
@ 2012-03-05 11:51       ` Stefan Hajnoczi
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2012-03-05 11:51 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-stable, Stefan Hajnoczi, qemu-devel

On Thu, Mar 1, 2012 at 4:58 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 01.03.2012 17:22, Stefan Hajnoczi wrote:
>> On Thu, Mar 01, 2012 at 05:10:57PM +0100, Kevin Wolf wrote:
>>> On 27.02.2012 14:16, Stefan Hajnoczi wrote:
>>>> The L2 table cache reduces QED metadata reads that would be required
>>>> when translating LBAs to offsets into the image file.  Since requests
>>>> execute in parallel it is possible to share an L2 table between multiple
>>>> requests.
>>>>
>>>> There is a potential data corruption issue when an in-use L2 table is
>>>> evicted from the cache because the following situation occurs:
>>>>
>>>>   1. An allocating write performs an update to L2 table "A".
>>>>
>>>>   2. Another request needs L2 table "B" and causes table "A" to be
>>>>      evicted.
>>>>
>>>>   3. A new read request needs L2 table "A" but it is not cached.
>>>>
>>>> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
>>>> We must avoid doing overlapping I/O requests here since the worst case
>>>> outcome is that the L2 fetch completes before the L2 update and yields
>>>> stale data.  In that case we would effectively discard the L2 update and
>>>> lose data clusters!
>>>>
>>>> Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing
>>>> and debugging which led to discovery of this bug.
>>>>
>>>> Reported-by: Benoît Canet <benoit.canet@gmail.com>
>>>> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
>>>
>>> Thanks, applied to the block branch.
>>>
>>> How about a qemu-iotests case?
>>
>> The test case is not ready yet.  I started writing one but it is racy
>> because I haven't introduced a way of controlling AIO issue/complete for
>> tests.  My next step is to add that.
>
> Will it be specific to image formats using AIO then or is it generic
> enough that coroutine-based drivers work with it as well?

I don't know yet but it would be nice to support .bdrv_co_*()-based drivers too.

Stefan
