All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: "qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>,
	"fam@euphon.net" <fam@euphon.net>,
	Denis Lunev <den@virtuozzo.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>
Subject: Re: [PATCH 0/4] fix & merge block_status_above and is_allocated_above
Date: Wed, 20 Nov 2019 10:20:10 +0000	[thread overview]
Message-ID: <7f8574a2-8fd2-9724-a197-d67d3c69d538@virtuozzo.com> (raw)
In-Reply-To: <20191116163410.12129-1-vsementsov@virtuozzo.com>

16.11.2019 19:34, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> I wanted to understand, what is the real difference between bdrv_block_status_above
> and bdrv_is_allocated_above, IMHO bdrv_is_allocated_above should work through
> bdrv_block_status_above..
> 
> And I found the problem: bdrv_is_allocated_above considers space after EOF as
> UNALLOCATED for intermediate nodes..
> 
> UNALLOCATED is not about allocation at fs level, but about should we go to backing or
> not.. And it seems incorrect for me, as in case of short backing file, we'll read
> zeroes after EOF, instead of going further by backing chain.
> 
> This leads to the following effect:
> 
> ./qemu-img create -f qcow2 base.qcow2 2M
> ./qemu-io -c "write -P 0x1 0 2M" base.qcow2
> 
> ./qemu-img create -f qcow2 -b base.qcow2 mid.qcow2 1M
> ./qemu-img create -f qcow2 -b mid.qcow2 top.qcow2 2M
> 
> Region 1M..2M is shadowed by short middle image, so guest sees zeroes:
> ./qemu-io -c "read -P 0 1M 1M" top.qcow2
> read 1048576/1048576 bytes at offset 1048576
> 1 MiB, 1 ops; 00.00 sec (22.795 GiB/sec and 23341.5807 ops/sec)
> 
> But after commit guest visible state is changed, which seems wrong for me:
> ./qemu-img commit top.qcow2 -b mid.qcow2
> 
> ./qemu-io -c "read -P 0 1M 1M" mid.qcow2
> Pattern verification failed at offset 1048576, 1048576 bytes
> read 1048576/1048576 bytes at offset 1048576
> 1 MiB, 1 ops; 00.00 sec (4.981 GiB/sec and 5100.4794 ops/sec)
> 
> ./qemu-io -c "read -P 1 1M 1M" mid.qcow2
> read 1048576/1048576 bytes at offset 1048576
> 1 MiB, 1 ops; 00.00 sec (3.365 GiB/sec and 3446.1606 ops/sec)
> 
> 
> I don't know, is it a real bug, as I don't know, do we support backing file larger than
> its parent. Still, I'm not sure that this behavior of bdrv_is_allocated_above don't lead
> to other problems.
> 
> =====
> 
> Hmm, bdrv_block_allocated_above behaves strange too:
> 
> with want_zero=true, it may report unallocated zeroes because of short backing files, which
> are actually "allocated" in POV of backing chains. But I see this may influence only
> qemu-img compare, and I don't see can it trigger some bug..
> 
> with want_zero=false, it may do no progress because of short backing file. Moreover it may
> report EOF in the middle!! But want_zero=false used only in bdrv_is_allocated, which considers
> onlyt top layer, so it seems OK.
> 
> =====
> 
> So, I propose these series, still I'm not sure is there a real bug.
> 
> Vladimir Sementsov-Ogievskiy (4):
>    block/io: fix bdrv_co_block_status_above
>    block/io: bdrv_common_block_status_above: support include_base
>    block/io: bdrv_common_block_status_above: support bs == base
>    block/io: fix bdrv_is_allocated_above
> 
>   block/io.c                 | 104 ++++++++++++++++++-------------------
>   tests/qemu-iotests/154.out |   4 +-
>   2 files changed, 53 insertions(+), 55 deletions(-)
> 


Interesting that the problem illustrated here is not fixed by the series, it's actually
relates to the fact that mirror does truncation with PREALLOC_MODE_OFF, which leads
to unallocated qcow2 clusters, which I think should be fixed too.

To illustrate the problem fixed by the series, we should commit to base:

# ./qemu-img commit top.qcow2 -b base.qcow2
Image committed.
# ./qemu-io -c "read -P 0 1M 1M" base.qcow2
Pattern verification failed at offset 1048576, 1048576 bytes
read 1048576/1048576 bytes at offset 1048576
1 MiB, 1 ops; 00.00 sec (5.366 GiB/sec and 5494.4149 ops/sec)

=======

Hmm, but how to fix the problem about truncate? I think truncate must not make
underlying backing available for read.. Discard operation doesn't do it.

So, actually on PREALLOC_MODE_OFF we must allocated L2 tables and mark new clusters
as ZERO?

To improve it, we can add "zero bit" to L1 table entry..

-- 
Best regards,
Vladimir

  parent reply	other threads:[~2019-11-20 10:21 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-16 16:34 [PATCH 0/4] fix & merge block_status_above and is_allocated_above Vladimir Sementsov-Ogievskiy
2019-11-16 16:34 ` [PATCH 1/4] block/io: fix bdrv_co_block_status_above Vladimir Sementsov-Ogievskiy
2019-11-25 16:00   ` Kevin Wolf
2019-11-26  7:26     ` Vladimir Sementsov-Ogievskiy
2019-11-26 14:20       ` Kevin Wolf
2019-11-16 16:34 ` [PATCH 2/4] block/io: bdrv_common_block_status_above: support include_base Vladimir Sementsov-Ogievskiy
2019-11-25 16:19   ` Kevin Wolf
2019-11-16 16:34 ` [PATCH 3/4] block/io: bdrv_common_block_status_above: support bs == base Vladimir Sementsov-Ogievskiy
2019-11-25 16:23   ` Kevin Wolf
2019-11-16 16:34 ` [PATCH 4/4] block/io: fix bdrv_is_allocated_above Vladimir Sementsov-Ogievskiy
2019-11-19 10:22 ` [PATCH 0/4] fix & merge block_status_above and is_allocated_above Max Reitz
2019-11-19 12:02   ` Denis V. Lunev
2019-11-19 12:12     ` Vladimir Sementsov-Ogievskiy
2019-11-19 12:20     ` Max Reitz
2019-11-19 12:30       ` Vladimir Sementsov-Ogievskiy
2019-11-19 13:28         ` Kevin Wolf
2019-11-19 12:05 ` Kevin Wolf
2019-11-19 12:17   ` Vladimir Sementsov-Ogievskiy
2019-11-19 12:32     ` Vladimir Sementsov-Ogievskiy
2019-11-19 12:34       ` Vladimir Sementsov-Ogievskiy
2019-11-19 12:49         ` Vladimir Sementsov-Ogievskiy
2019-11-19 14:21     ` Kevin Wolf
2019-11-19 14:54 ` Kevin Wolf
2019-11-19 16:58 ` Stefan Hajnoczi
2019-11-19 17:11   ` Vladimir Sementsov-Ogievskiy
2019-11-20 10:20 ` Vladimir Sementsov-Ogievskiy [this message]
2019-11-20 11:44   ` Kevin Wolf
2019-11-20 12:04     ` Vladimir Sementsov-Ogievskiy
2019-11-20 13:30       ` Kevin Wolf
2019-11-20 13:51         ` Vladimir Sementsov-Ogievskiy
2019-11-20 13:37       ` Vladimir Sementsov-Ogievskiy
2019-11-20 16:24 ` [PATCH 5/4] iotests: add commit top->base cases to 274 Vladimir Sementsov-Ogievskiy
2019-11-25 10:08 ` [PATCH 0/4] fix & merge block_status_above and is_allocated_above Vladimir Sementsov-Ogievskiy
2019-11-25 15:46   ` Kevin Wolf
2019-11-26  7:27     ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7f8574a2-8fd2-9724-a197-d67d3c69d538@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@virtuozzo.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.