Re: [PATCH v3 09/19] block: introduce FleecingState class - Vladimir Sementsov-Ogievskiy

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Hanna Reitz <hreitz@redhat.com>, qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, armbru@redhat.com,
	xiechanglong.d@gmail.com, wencongyang2@huawei.com,
	eblake@redhat.com, kwolf@redhat.com, jsnow@redhat.com,
	nikita.lapshin@virtuozzo.com
Subject: Re: [PATCH v3 09/19] block: introduce FleecingState class
Date: Tue, 18 Jan 2022 21:35:13 +0300	[thread overview]
Message-ID: <0fafb39a-d104-36cc-5056-720e558bb042@virtuozzo.com> (raw)
In-Reply-To: <84ca9222-faee-de23-d8e7-06c39c938e9e@redhat.com>

18.01.2022 19:37, Hanna Reitz wrote:
> On 22.12.21 18:40, Vladimir Sementsov-Ogievskiy wrote:
>> FleecingState represents state shared between copy-before-write filter
>> and upcoming fleecing block driver.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/fleecing.h  | 135 ++++++++++++++++++++++++++++++++++
>>   block/fleecing.c  | 182 ++++++++++++++++++++++++++++++++++++++++++++++
>>   MAINTAINERS       |   2 +
>>   block/meson.build |   1 +
>>   4 files changed, 320 insertions(+)
>>   create mode 100644 block/fleecing.h
>>   create mode 100644 block/fleecing.c
>>
>> diff --git a/block/fleecing.h b/block/fleecing.h
>> new file mode 100644
>> index 0000000000..fb7b2f86c4
>> --- /dev/null
>> +++ b/block/fleecing.h
>> @@ -0,0 +1,135 @@
>> +/*
>> + * FleecingState
>> + *
>> + * The common state of image fleecing, shared between copy-before-write filter
>> + * and fleecing block driver.
> 
>  From this documentation, it’s unclear to me who owns the FleecingState object.  I would have assumed it’s the fleecing node, and if it is, I wonder why we even have this external interface instead of considering FleecingState a helper object for the fleecing block driver (or rather the block driver’s opaque state, which it basically is, as far as I can see from peeking into the next patch), and putting both into a single file with no external interface except for fleecing_mark_done_and_wait_readers().

FleecingState object is owned by copy-before-write node. copy-before-write has the whole information, and it owns BlockCopyState object, which is used to create FleecingState. copy-before-write node can easily detect that its target is fleecing filter, and initialize FleecingState in this case.

On the other hand, if we want to create FleecingState from fleecing filter (or even merge the state into its driver state), we'll have to search through parents to find copy-before-write, which may be not trivial. Moreover, at time of open() we may have no parents yet.

Hmm, but may be just pass bcs to fleecing-node by activate(), like we are going to do with fleecing state?  I'll give it a try.

> 
>> + *
>> + * Copyright (c) 2021 Virtuozzo International GmbH.
>> + *
>> + * Author:
>> + *  Sementsov-Ogievskiy Vladimir <vsementsov@virtuozzo.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + *
>> + * Fleecing scheme looks as follows:
>> + *
>> + * [guest blk]                   [nbd export]
>> + *    |                              |
>> + *    |root                          |
>> + *    v                              v
>> + * [copy-before-write]--target-->[fleecing drv]
>> + *    |                          /   |
>> + *    |file                     /    |file
>> + *    v                        /     v
>> + * [active disk]<--source-----/  [temp disk]
>> + *
>> + * Note that "active disk" is also called just "source" and "temp disk" is also
>> + * called "target".
>> + *
>> + * What happens here:
>> + *
>> + * copy-before-write filter performs copy-before-write operations: on guest
>> + * write we should copy old data to target child before rewriting. Note that we
>> + * write this data through fleecing driver: it saves a possibility to implement
>> + * a kind of cache in fleecing driver in future.
> 
> I don’t understand why this explanation is the first one given (and the only one given explicitly as a reason) for why we want a fleecing block driver.

Actually, benefits are given in the next commit message.

> 
> (1) If we implement caching later, I have a feeling that we’ll want new options for this.  So a management layer that wants caching will need to be updated at that point anyway (to specify these new options), so I don’t understand how adding a fleecing block driver now would make it easier later on to introduce caching.
> 
> (1b) It’s actually entirely possible that we will not want to use the fleecing driver for caching, because we decide that caching is much more useful as its own dedicated block driver.
> 
> (2) There are much better arguments below.  This FleecingState you introduce here makes it clear why we need a fleecing block driver; it helps with synchronization, and it provides the “I’m done with this bit, I don’t care about it anymore” discard interface.
> 
>> + *
>> + * Fleecing user is nbd export: it can read from fleecing node, which guarantees
>> + * a snapshot-view for fleecing user. Fleecing user may also do discard
>> + * operations.
>> + *
>> + * FleecingState is responsible for most of the fleecing logic:
>> + *
>> + * 1. Fleecing read. Handle reads of fleecing user: we should decide where from
>> + * to read, from source node or from copy-before-write target node. In former
>> + * case we need to synchronize with guest writes. See fleecing_read_lock() and
>> + * fleecing_read_unlock() functionality.
>> + *
>> + * 2. Guest write synchronization (part of [1] actually). See
>> + * fleecing_mark_done_and_wait_readers()
>> + *
>> + * 3. Fleecing discard. Used by fleecing user when corresponding area is already
>> + * copied. Fleecing user may discard the area which is not needed anymore, that
>> + * should result in:
>> + *   - discarding data to free disk space
>> + *   - clear bits in copy-bitmap of block-copy, to avoid extra copy-before-write
>> + *     operations
>> + *   - clear bits in access-bitmap of FleecingState, to avoid further wrong
>> + *     access
>> + *
>> + * Still, FleecingState doesn't own any block children, so all real io
>> + * operations (reads, writes and discards) are done by copy-before-write filter
>> + * and fleecing block driver.
> 
> I find this a bit confusing, because for me, it raised the question of “why would it own block childen?”, which led to me wanting to know even more where the place of FleecingState is.  This sentence makes it really sound as if FleecingState is its own independent object floating around somewhere, not owned by anything, and that feels very wrong.

It's owned by copy-before-write node. Hmm, and seems doesn't operate directly on any block children, so this sentence may be removed.

-- 
Best regards,
Vladimir