* [PATCH 0/1] Fix qcow2 corruption after addition of subcluster support
From: Maxim Levitsky @ 2020-11-23 15:49 UTC
To: qemu-devel
Cc: Kevin Wolf, vsementsov, Alberto Garcia, qemu-block, zhang_youjia, Max Reitz, andrey.shinkevich, Maxim Levitsky

This weekend I discovered that one of my VMs had started to act weird.
While investigating, I found out that it, and most of the other VMs I
have, had developed qcow2 corruption.

After some bisecting, digging through dumps, and debugging, I think I
found the root cause and a fix.

In addition, I would like to raise a few points:

1. I had to use qcow2-dump from (*) (it is also on GitHub, but without
   source -- weird...) to examine the L1/L2 tables and refcount tables.

   There have been a few attempts (**), (***) to make an official tool
   that would dump at least the L1/L2/refcount tables, but nothing has
   been accepted so far.

   I think an official tool to dump at least the basic qcow2 structures
   would be very helpful for discovering and debugging qcow2 corruption.
   I had to study the qcow2 format again for this, so I can help with
   that.

2. 'qemu-img check -r all' is happy to create clusters that are
   referenced from multiple L2 entries. This isn't technically wrong,
   since a write through any of these L2 entries will COW the cluster.
   However, I would be happy to know that my images don't have such
   clusters, so I would like qemu-img check to at least notify about
   this. Can we add some -check-weird-but-legal flag to it for this
   check?

A few notes about the conditions for this corruption to occur:

I have a bunch of VMs, each running on two qcow2 files: a base image and
a snapshot on top of it, which I 'qemu-img commit' once in a while.
Discard is enabled to avoid wasting disk space, so 'qemu-img commit'
often discards data on the base disk.

The corruption happens after such a commit, and manifests in a stale L2
entry that was supposed to be discarded but now points to an unused
cluster. I wasn't able to reproduce this in a small test case so far.

Best regards,
	Maxim Levitsky

(*)   https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02760.html
(**)  https://patchwork.kernel.org/project/qemu-devel/patch/20180328133845.20632-1-berto@igalia.com/
(***) https://patchwork.kernel.org/project/qemu-devel/cover/1578990137-308222-1-git-send-email-andrey.shinkevich@virtuozzo.com/

Maxim Levitsky (1):
  Fix qcow2 corruption on discard

 block/qcow2-cluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.26.2
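[Editor's note: as an illustration of what the dump tool discussed in the
cover letter could start from, here is a minimal sketch that parses the
fixed part of a qcow2 header and decodes an L1 entry. It is not QEMU code;
the field layout and bit masks are taken from the qcow2 on-disk format
specification (docs/interop/qcow2.txt in the QEMU tree), and the function
names are invented for this sketch.]

```python
import struct

# Layout of the first 72 bytes of the qcow2 header, per the qcow2 spec;
# all fields are big-endian.
HEADER_FMT = ">IIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)
QCOW2_MAGIC = 0x514649FB  # the bytes 'Q', 'F', 'I', 0xfb

def parse_header(buf: bytes) -> dict:
    """Parse the fixed part of a qcow2 header into a field dict."""
    fields = struct.unpack(HEADER_FMT, buf[:struct.calcsize(HEADER_FMT)])
    hdr = dict(zip(HEADER_FIELDS, fields))
    if hdr["magic"] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return hdr

def l1_entry_info(entry: int) -> dict:
    """Decode one 64-bit L1 table entry (bits 9-55: L2 table offset)."""
    return {
        "l2_offset": entry & 0x00FFFFFFFFFFFE00,
        "copied": bool(entry & (1 << 63)),  # refcount == 1, safe to write
    }
```

From here, a real tool would read the L1 table at `l1_table_offset`, then
each L2 table it references, which is enough to spot stale entries like the
one this thread is about.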
* [PATCH 1/1] Fix qcow2 corruption on discard
From: Maxim Levitsky @ 2020-11-23 15:49 UTC
To: qemu-devel
Cc: Kevin Wolf, vsementsov, Alberto Garcia, qemu-block, zhang_youjia, Max Reitz, andrey.shinkevich, Maxim Levitsky

Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
introduced a subtle change to code in zero_in_l2_slice:

It swapped the order of

1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);

To

1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);

It seems harmless, however the call to qcow2_free_any_clusters
can trigger a cache flush which can mark the L2 table as clean,
and assuming that this was the last write to it,
a stale version of it will remain on the disk.

Now we have a valid L2 entry pointing to a freed cluster. Oops.

Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 block/qcow2-cluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 485b4cb92e..267b46a4ca 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
             continue;
         }
 
-        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
         if (unmap) {
             qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
         }
         set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
+        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
         if (has_subclusters(s)) {
             set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
         }
-- 
2.26.2
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Kevin Wolf @ 2020-11-23 16:09 UTC
To: Maxim Levitsky
Cc: vsementsov, Alberto Garcia, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> introduced a subtle change to code in zero_in_l2_slice:
>
> It swapped the order of
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>
> To
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
>
> It seems harmless, however the call to qcow2_free_any_clusters
> can trigger a cache flush which can mark the L2 table as clean,
> and assuming that this was the last write to it,
> a stale version of it will remain on the disk.
>
> Now we have a valid L2 entry pointing to a freed cluster. Oops.
>
> Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  block/qcow2-cluster.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 485b4cb92e..267b46a4ca 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
>              continue;
>          }
>  
> -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>          if (unmap) {
>              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
>          }
>          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);

Good catch, but I think your order is wrong, too. We need the original
order from before 205fa50750:

1. qcow2_cache_entry_mark_dirty()
   set_l2_entry() + set_l2_bitmap()

2. qcow2_free_any_cluster()

The order between qcow2_cache_entry_mark_dirty() and set_l2_entry()
shouldn't matter, but it's important that we update the refcount table
only after the L2 table has been updated to not reference the cluster
any more. Otherwise a crash could lead to a situation where the cluster
is allocated (because it has refcount 0), but it was still in use in an
L2 table. This is a classic corruption scenario.

Kevin
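[Editor's note: Kevin's crash-ordering argument can be made concrete with
a toy model. This is an illustration only, not QEMU code: it treats each
metadata update as reaching the disk atomically and in order, and checks
the invariant that no cluster may still be referenced by an L2 entry while
its refcount is 0 (such a cluster could be reallocated and overwritten).
All names below are invented for the sketch.]

```python
def crash_states(updates, state):
    """Yield the on-disk state after every possible crash point,
    i.e. after each prefix of the update sequence has hit the disk."""
    for i in range(len(updates) + 1):
        s = dict(state)
        for update in updates[:i]:
            update(s)
        yield s

def is_safe(s):
    # The corruption: an L2 entry still points at a cluster with refcount 0.
    return not (s["l2_points_to_cluster"] and s["refcount"] == 0)

initial = {"l2_points_to_cluster": True, "refcount": 1}
clear_l2 = lambda s: s.update(l2_points_to_cluster=False)
free_cluster = lambda s: s.update(refcount=s["refcount"] - 1)

# Original (safe) order: the L2 entry is cleared on disk before the
# refcount drops. A crash in between only leaks a cluster, which
# 'qemu-img check' can repair.
safe = all(is_safe(s) for s in crash_states([clear_l2, free_cluster], initial))

# Swapped order: a crash after the free but before the L2 write leaves a
# valid L2 entry pointing at a freed cluster.
unsafe = all(is_safe(s) for s in crash_states([free_cluster, clear_l2], initial))

print(safe, unsafe)  # True False
```

The model is crude, but it shows why the free must come last: every crash
point in the safe order satisfies the invariant, while the swapped order
has one crash point that does not.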
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Alberto Garcia @ 2020-11-23 18:20 UTC
To: Kevin Wolf, Maxim Levitsky
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Mon 23 Nov 2020 05:09:41 PM CET, Kevin Wolf wrote:
>> Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
>> introduced a subtle change to code in zero_in_l2_slice:
>>
>> It swapped the order of
>>
>> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>> 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
>> 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>>
>> To
>>
>> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>> 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>> 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);

Ouch :( Good catch!

>> -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>>          if (unmap) {
>>              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
>>          }
>>          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
>> +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>
> Good catch, but I think your order is wrong, too. We need the original
> order from before 205fa50750:
>
> 1. qcow2_cache_entry_mark_dirty()
>    set_l2_entry() + set_l2_bitmap()
>
> 2. qcow2_free_any_cluster()

I agree with Kevin on this.

Berto
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Maxim Levitsky @ 2020-11-23 18:23 UTC
To: Alberto Garcia, Kevin Wolf
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Mon, 2020-11-23 at 19:20 +0100, Alberto Garcia wrote:
> On Mon 23 Nov 2020 05:09:41 PM CET, Kevin Wolf wrote:
> > > Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > > introduced a subtle change to code in zero_in_l2_slice:
> > >
> > > It swapped the order of
> > >
> > > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > > 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> > > 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > >
> > > To
> > >
> > > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > > 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > > 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
>
> Ouch :( Good catch!
>
> > > -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > >          if (unmap) {
> > >              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
> > >          }
> > >          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> > > +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> >
> > Good catch, but I think your order is wrong, too. We need the original
> > order from before 205fa50750:
> >
> > 1. qcow2_cache_entry_mark_dirty()
> >    set_l2_entry() + set_l2_bitmap()
> >
> > 2. qcow2_free_any_cluster()
>
> I agree with Kevin on this.

I also agree; I hadn't thought about this.

Best regards,
	Maxim Levitsky
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Kevin Wolf @ 2020-11-23 17:38 UTC
To: Maxim Levitsky
Cc: vsementsov, Alberto Garcia, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> introduced a subtle change to code in zero_in_l2_slice:
>
> It swapped the order of
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>
> To
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
>
> It seems harmless, however the call to qcow2_free_any_clusters
> can trigger a cache flush which can mark the L2 table as clean,
> and assuming that this was the last write to it,
> a stale version of it will remain on the disk.

Do you have more details on this last paragraph? I'm trying to come up
with a reproducer, but I don't see how qcow2_free_any_clusters() could
flush the L2 table cache. (It's easy to get it to flush the refcount
block cache, but that's useless for a reproducer.)

The only way I see to flush any cache with it is the
qcow2_cache_set_dependency() call in update_refcount(). This will
always flush the cache that the L2 cache depends on - which will never
be the L2 cache itself, but always either the refcount cache or
nothing.

There are more options in alloc_refcount_block() if we're allocating a
new refcount block, but in the context of freeing clusters we'll never
need to do that.

Whatever I tried, at the end of zero_in_l2_slice(), I have a dirty L2
table and a dirty refcount block in the cache, with a dependency that
makes sure that the L2 table will be written out first.

If you don't have the information yet, can you try to debug your manual
reproducer a bit more to find out how this happens?

Kevin

> Now we have a valid L2 entry pointing to a freed cluster. Oops.
>
> Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  block/qcow2-cluster.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 485b4cb92e..267b46a4ca 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
>              continue;
>          }
>  
> -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>          if (unmap) {
>              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
>          }
>          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>          if (has_subclusters(s)) {
>              set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
>          }
> -- 
> 2.26.2
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Maxim Levitsky @ 2020-11-23 18:11 UTC
To: Kevin Wolf
Cc: vsementsov, Alberto Garcia, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Mon, 2020-11-23 at 18:38 +0100, Kevin Wolf wrote:
> On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> > Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > introduced a subtle change to code in zero_in_l2_slice:
> >
> > It swapped the order of
> >
> > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> > 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> >
> > To
> >
> > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> >
> > It seems harmless, however the call to qcow2_free_any_clusters
> > can trigger a cache flush which can mark the L2 table as clean,
> > and assuming that this was the last write to it,
> > a stale version of it will remain on the disk.
>
> Do you have more details on this last paragraph? I'm trying to come up
> with a reproducer, but I don't see how qcow2_free_any_clusters() could
> flush the L2 table cache. (It's easy to get it to flush the refcount
> block cache, but that's useless for a reproducer.)
>
> The only way I see to flush any cache with it is in update_refcount()
> the qcow2_cache_set_dependency() call. This will always flush the cache
> that the L2 cache depends on - which will never be the L2 cache itself,
> but always either the refcount cache or nothing.
>
> There are more options in alloc_refcount_block() if we're allocating a
> new refcount block, but in the context of freeing clusters we'll never
> need to do that.
>
> Whatever I tried, at the end of zero_in_l2_slice(), I have a dirty L2
> table and a dirty refcount block in the cache, with a dependency that
> makes sure that the L2 table will be written out first.
>
> If you don't have the information yet, can you try to debug your manual
> reproducer a bit more to find out how this happens?

I'll do this tomorrow.

Best regards,
	Maxim Levitsky

> Kevin
>
> > Now we have a valid L2 entry pointing to a freed cluster. Oops.
> >
> > Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  block/qcow2-cluster.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index 485b4cb92e..267b46a4ca 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
> >              continue;
> >          }
> >  
> > -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> >          if (unmap) {
> >              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
> >          }
> >          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> > +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> >          if (has_subclusters(s)) {
> >              set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
> >          }
> > -- 
> > 2.26.2
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Kevin Wolf @ 2020-11-24 9:17 UTC
To: Maxim Levitsky
Cc: vsementsov, Alberto Garcia, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On 23.11.2020 at 19:11, Maxim Levitsky wrote:
> On Mon, 2020-11-23 at 18:38 +0100, Kevin Wolf wrote:
> > On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> > > Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> > > introduced a subtle change to code in zero_in_l2_slice:
> > >
> > > It swapped the order of
> > >
> > > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > > 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> > > 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > >
> > > To
> > >
> > > 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> > > 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> > > 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> > >
> > > It seems harmless, however the call to qcow2_free_any_clusters
> > > can trigger a cache flush which can mark the L2 table as clean,
> > > and assuming that this was the last write to it,
> > > a stale version of it will remain on the disk.
> >
> > Do you have more details on this last paragraph? I'm trying to come up
> > with a reproducer, but I don't see how qcow2_free_any_clusters() could
> > flush the L2 table cache. (It's easy to get it to flush the refcount
> > block cache, but that's useless for a reproducer.)
> >
> > The only way I see to flush any cache with it is in update_refcount()
> > the qcow2_cache_set_dependency() call. This will always flush the cache
> > that the L2 cache depends on - which will never be the L2 cache itself,
> > but always either the refcount cache or nothing.
> >
> > There are more options in alloc_refcount_block() if we're allocating a
> > new refcount block, but in the context of freeing clusters we'll never
> > need to do that.
> >
> > Whatever I tried, at the end of zero_in_l2_slice(), I have a dirty L2
> > table and a dirty refcount block in the cache, with a dependency that
> > makes sure that the L2 table will be written out first.
> >
> > If you don't have the information yet, can you try to debug your manual
> > reproducer a bit more to find out how this happens?
>
> I'll do this tomorrow.

As the last RC for 5.2 is today, I will send a v2 that changes the fix
to restore the original order. We can then continue work to find a
minimal reproducer and merge the test case in the early 6.0 cycle.

Kevin
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Alberto Garcia @ 2020-11-24 18:59 UTC
To: Kevin Wolf, Maxim Levitsky
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
> We can then continue work to find a minimal reproducer and merge the
> test case in the early 6.0 cycle.

I haven't been able to reproduce the problem yet, do you have any
findings?

Berto
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Maxim Levitsky @ 2020-11-24 18:59 UTC
To: Alberto Garcia, Kevin Wolf
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Tue, 2020-11-24 at 19:59 +0100, Alberto Garcia wrote:
> On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
> > We can then continue work to find a minimal reproducer and merge the
> > test case in the early 6.0 cycle.
>
> I haven't been able to reproduce the problem yet, do you have any
> findings?
>
> Berto

I have a working reproducer script. I'll send it in an hour or so.

Best regards,
	Maxim Levitsky
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Maxim Levitsky @ 2020-11-24 19:44 UTC
To: Alberto Garcia, Kevin Wolf
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

[-- Attachment #1: Type: text/plain, Size: 2830 bytes --]

On Tue, 2020-11-24 at 20:59 +0200, Maxim Levitsky wrote:
> On Tue, 2020-11-24 at 19:59 +0100, Alberto Garcia wrote:
> > On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
> > > We can then continue work to find a minimal reproducer and merge the
> > > test case in the early 6.0 cycle.
> >
> > I haven't been able to reproduce the problem yet, do you have any
> > findings?
> >
> > Berto
>
> I have a working reproducer script. I'll send it in an hour or so.

I have attached a minimal reproducer for this issue.
I can convert this to an iotest if you think that this is worth it.

These are the exact conditions for the corruption to happen:

1. The image must have at least 5 refcount tables (one more than the
   default refcount table cache size, which is 4 by default).

2. An IO pattern that fully populates the 4-entry refcount table cache:

   The easiest way to do this is to have 4 L2 entries populated in the
   base image, such that each entry references a physical cluster
   served by a different refcount table. Then discard these entries in
   the snapshot, triggering discards in the base file during the
   commit, which populates the refcount table cache.

3. A discard of a cluster that belongs to the 5th refcount table (done
   in exactly the same way as the discards above). It has to happen
   soon, before the L2 cache is flushed by some unrelated IO.

This triggers the corruption. The call stack is:

qcow2_free_any_cluster ->
  qcow2_free_clusters ->
    update_refcount:
      // This sets a dependency between flushing the refcount cache
      // and the L2 cache
      if (decrease)
          qcow2_cache_set_dependency(bs, s->refcount_block_cache,
                                     s->l2_table_cache);
      ret = alloc_refcount_block(bs, cluster_index, &refcount_block);
        return load_refcount_block(bs, refcount_block_offset,
                                   refcount_block);
          return qcow2_cache_get(...
            qcow2_cache_do_get
              /* because of a cache miss, we have to evict an entry */
              ret = qcow2_cache_entry_flush(bs, c, i);
                if (c->depends) {
                    /* this flushes the L2 cache */
                    ret = qcow2_cache_flush_dependency(bs, c);
                }

I have attached a reproducer that works with almost any cluster size
and refcount block size. Cluster sizes below 4K don't work because the
commit is done by the mirror job, which works at 4K granularity, and
that results in it not doing any discards due to various alignment
restrictions. If I patch qemu to make the mirror job work at 512B
granularity, the test reproduces for small clusters as well.

The reproducer creates a qcow2 image in the current directory, and it
needs about 11G for the default parameters (64K cluster size, 16-bit
refcounts). For 4K cluster size and 64-bit refcounts, it needs only
11M. (This can be changed by editing the script.)

Best regards,
	Maxim Levitsky

[-- Attachment #2: reproducer.sh --]
[-- Type: application/x-shellscript, Size: 2088 bytes --]
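[Editor's note: the cache interaction in the call stack above can be
sketched with a toy model. This is an illustration, not QEMU code: the
`Cache` class below only mimics the relevant behavior of QEMU's
qcow2 metadata caches (fixed size, eviction on a miss, and a dependency
that is flushed before the dependent cache); all names are invented.]

```python
class Cache:
    def __init__(self, name, size):
        self.name, self.size = name, size
        self.entries = []     # resident entries, oldest first
        self.depends = None   # cache that must be flushed before this one

    def set_dependency(self, other):
        self.depends = other

    def flush(self, log):
        if self.depends is not None:
            self.depends.flush(log)   # flush the dependency first
            self.depends = None
        log.append(self.name)

    def get(self, key, log):
        if key in self.entries:
            self.entries.remove(key)          # cache hit
        elif len(self.entries) >= self.size:
            self.entries.pop(0)               # miss with a full cache:
            self.flush(log)                   # evict, which flushes
        self.entries.append(key)

log = []
l2_cache = Cache("l2", 4)
refcount_cache = Cache("refcount", 4)
refcount_cache.set_dependency(l2_cache)   # as done by update_refcount()

for block in range(5):                    # discards touch 5 refcount blocks
    refcount_cache.get(block, log)

print(log)  # ['l2', 'refcount']
```

Touching the 5th refcount block misses in the full 4-entry cache, the
eviction flushes the refcount cache, and the dependency flushes the L2
cache first. In the buggy ordering that L2 flush happens while the L2
entry still holds its pre-discard value, which is exactly the stale
entry described in the cover letter.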
* Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Alberto Garcia @ 2020-11-25 16:49 UTC
To: Maxim Levitsky, Kevin Wolf
Cc: vsementsov, qemu-block, zhang_youjia, qemu-devel, Max Reitz, andrey.shinkevich

On Tue 24 Nov 2020 08:44:00 PM CET, Maxim Levitsky wrote:
> On Tue, 2020-11-24 at 20:59 +0200, Maxim Levitsky wrote:
>> On Tue, 2020-11-24 at 19:59 +0100, Alberto Garcia wrote:
>> > On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
>> > > We can then continue work to find a minimal reproducer and merge the
>> > > test case in the early 6.0 cycle.
>> >
>> > I haven't been able to reproduce the problem yet, do you have any
>> > findings?
>> >
>> > Berto
>>
>> I have a working reproducer script. I'll send it in an hour or so.
>
> I have attached a minimal reproducer for this issue.
> I can convert this to an iotest if you think that this is worth it.

I think it would be a good idea to have an iotest; I can also prepare it
if you don't have time. Thanks for the script!

Berto