qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Denis Lunev <den@virtuozzo.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE
Date: Tue, 13 Aug 2019 16:46:20 +0200	[thread overview]
Message-ID: <6c666a03-e679-c566-e309-5472a2bbef3f@redhat.com> (raw)
In-Reply-To: <20190813132150.GI4663@localhost.localdomain>


[-- Attachment #1.1: Type: text/plain, Size: 4272 bytes --]

On 13.08.19 15:21, Kevin Wolf wrote:
> Am 13.08.2019 um 14:01 hat Kevin Wolf geschrieben:
>> Am 13.08.2019 um 13:28 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> 13.08.2019 14:04, Kevin Wolf wrote:
>>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>>> returned file. But is it correct behavior at all? If returned file
>>>>> itself has a backing file, we may report as totally unallocated and
>>>>> area which actually has data in bottom backing file.
>>>>>
>>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>>> by following commit with a test. Let's make raw-format behave more
>>>>> correctly returning BDRV_BLOCK_DATA.
>>>>>
>>>>> Suggested-by: Max Reitz <mreitz@redhat.com>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>
>>>> After some reading, I think I came to the conclusion that RAW is the
>>>> correct thing to do. There is indeed a problem, but this patch is trying
>>>> to fix it in the wrong place.
>>>>
>>>> In the case where the backing file contains some data, and we have a
>>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>>> block is not defined by the queried backing file layer, so it is
>>>> completely correct that bdrv_is_allocated() returns false, like it would
>>>> if you queried the qcow2 layer directly. If it returned true, we would
>>>> copy everything, which isn't right either (the test cases should may add
>>>> the qemu-img map output of the target so this becomes visible).
>>>>
>>>> The problem is that we try to recurse along the backing chain, but we
>>>> fail to make the step from the raw node to the backing file.
>>>
>>> I'd say, the problem is that we ignore backing chain of non-backing
>>> child
>>
>> Yes, exactly. And I know even less about what happens if a child is
>> neither bs->file nor bs->backing. Imagine a qcow2 image with an external
>> data file that is a qcow2 image with a backing file itself. :-)
>>
>> Actually, just having two qcow2 layers nested with bs->file probably
>> already fails.
>>
>>>> Note that just extending Max's "deal with filters" is not enough to fix
>>>> this because raw doesn't actually meet all of the criteria for being a
>>>> filter in this sense (at least because the 'offset' option can change
>>>> offsets between raw and its child).
>>>>
>>>> I think this is essentially a result of special-casing backing files
>>>> everywhere instead of treating them like children like any other.
>>>
>>> But we need to special-case them, as we have interfaces operating on
>>> backing chain,
>>
>> I'm not sure yet if this means that these interfaces are wrong, but it
>> might. But in any case, I think we depend on special-casing in more
>> places than we should.
>>
>>>> bdrv_co_block_status_above() probably shouldn't recurse along the
>>>> backing chain, but along the returned *file pointers, and consider the
>>>> returned offset in *map.
>>>
>>> So, you mean that in case of unallocated, format layer should return
>>> it's backing file as file?
>>
>> Yes, because that's where it's reading the data from.
>>
>> Hm... Now I wonder what this means for DATA... In theory it would have
>> to be set for backing files, but that would make it completely useless.
>> We can distinguish the cases by looking at *file, but how does the
>> generic block layer know which child should be counted as "allocated"
>> and which shouldn't?
> 
> Possible answer to my own question:
> 
> bdrv_is_allocated(bs) isn't even asking a complete question. What we
> really need to ask is whether a specific child is where data comes from.
> 
> What the current callers of bdrv_is_allocated() are interested in is
> whether the data comes from bs->backing or from somewhere else. That is,
> if removing bs from the graph (so that all parents of bs would point to
> bs->backing instead) would still result in the same data in the given
> block.

Maybe callers of bdrv_is_allocated() should just ensure that the node
they pass actually has a backing file.

(If it doesn’t, they should skip all filters until it does.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2019-08-13 14:46 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-12 18:11 [Qemu-devel] [PATCH 0/2] deal with BDRV_BLOCK_RAW Vladimir Sementsov-Ogievskiy
2019-08-12 18:11 ` [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE Vladimir Sementsov-Ogievskiy
2019-08-13 11:04   ` Kevin Wolf
2019-08-13 11:28     ` Vladimir Sementsov-Ogievskiy
2019-08-13 12:01       ` Kevin Wolf
2019-08-13 13:21         ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2019-08-13 14:46           ` Max Reitz [this message]
2019-08-13 14:43     ` [Qemu-devel] " Max Reitz
2019-08-13 14:56       ` Vladimir Sementsov-Ogievskiy
2019-08-13 15:03         ` Max Reitz
2019-08-13 15:22           ` Vladimir Sementsov-Ogievskiy
2019-08-13 16:07             ` Max Reitz
2019-08-13 15:41       ` Kevin Wolf
2019-08-13 15:54         ` Vladimir Sementsov-Ogievskiy
2019-08-13 16:08           ` Kevin Wolf
2019-08-13 16:32             ` Vladimir Sementsov-Ogievskiy
2019-08-14  6:27               ` Vladimir Sementsov-Ogievskiy
2019-08-13 16:21         ` Max Reitz
2019-08-12 18:11 ` [Qemu-devel] [PATCH 2/2] iotests: test mirroring qcow2 under raw format Vladimir Sementsov-Ogievskiy
2019-08-13  9:10   ` Kevin Wolf
2019-08-13  9:22     ` Vladimir Sementsov-Ogievskiy
2019-08-13  9:36       ` Kevin Wolf
2019-08-12 19:46 ` [Qemu-devel] [PATCH 0/2] deal with BDRV_BLOCK_RAW Max Reitz
2019-08-12 19:50   ` Max Reitz
2019-08-13  8:39     ` Vladimir Sementsov-Ogievskiy
2019-08-13  9:01       ` Vladimir Sementsov-Ogievskiy
2019-08-13  9:33         ` Vladimir Sementsov-Ogievskiy
2019-08-13 11:14           ` Vladimir Sementsov-Ogievskiy
2019-08-13 11:51             ` Kevin Wolf
2019-08-13 13:00               ` Vladimir Sementsov-Ogievskiy
2019-08-13 14:31               ` Max Reitz
2019-08-13 14:46                 ` Vladimir Sementsov-Ogievskiy
2019-08-13 14:53                   ` Max Reitz
2019-08-13 15:03                     ` Kevin Wolf
2019-08-13 15:04                       ` Max Reitz
2019-08-13 14:50                 ` Eric Blake
2019-08-13  9:34   ` Kevin Wolf
2019-08-13 14:38     ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6c666a03-e679-c566-e309-5472a2bbef3f@redhat.com \
    --to=mreitz@redhat.com \
    --cc=den@virtuozzo.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).