All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: John Snow <jsnow@redhat.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	pkrempa@redhat.com, Eduardo Habkost <ehabkost@redhat.com>,
	qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	Max Reitz <mreitz@redhat.com>, Cleber Rosa <crosa@redhat.com>
Subject: Re: [PATCH 1/6] block: add bitmap-populate job
Date: Wed, 26 Feb 2020 08:07:07 +0300	[thread overview]
Message-ID: <d462c95f-a616-1ddb-f0fd-650be113c384@virtuozzo.com> (raw)
In-Reply-To: <5c03a2b5-7cba-b347-da80-f9d9acaccb6e@redhat.com>

25.02.2020 23:41, John Snow wrote:
> 
> 
> On 2/25/20 11:04 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 25.02.2020 3:56, John Snow wrote:
>>> This job copies the allocation map into a bitmap. It's a job because
>>> there's no guarantee that allocation interrogation will be quick (or
>>> won't hang), so it cannot be retrofit into block-dirty-bitmap-merge.
>>>
>>> It was designed with different possible population patterns in mind,
>>> but only top layer allocation was implemented for now.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>    qapi/block-core.json      |  48 +++++++++
>>>    qapi/job.json             |   2 +-
>>>    include/block/block_int.h |  21 ++++
>>>    block/bitmap-alloc.c      | 207 ++++++++++++++++++++++++++++++++++++++
>>>    blockjob.c                |   3 +-
>>>    block/Makefile.objs       |   1 +
>>>    6 files changed, 280 insertions(+), 2 deletions(-)
>>>    create mode 100644 block/bitmap-alloc.c
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 85e27bb61f..df1797681a 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -2245,6 +2245,54 @@
>>>          { 'command': 'block-dirty-bitmap-merge',
>>>            'data': 'BlockDirtyBitmapMerge' }
>>>    +##
>>> +# @BitmapPattern:
>>> +#
>>> +# An enumeration of possible patterns that can be written into a bitmap.
>>> +#
>>> +# @allocation-top: The allocation status of the top layer
>>> +#                  of the attached storage node.
>>> +#
>>> +# Since: 5.0
>>> +##
>>> +{ 'enum': 'BitmapPattern',
>>> +  'data': ['allocation-top'] }
>>> +
>>> +##
>>> +# @BlockDirtyBitmapPopulate:
>>> +#
>>> +# @job-id: identifier for the newly-created block job.
>>> +#
>>> +# @pattern: What pattern should be written into the bitmap?
>>> +#
>>> +# @on-error: the action to take if an error is encountered on a bitmap's
>>> +#            attached node, default 'report'.
>>> +#            'stop' and 'enospc' can only be used if the block device
>>> supports
>>> +#            io-status (see BlockInfo).
>>> +#
>>> +# @auto-finalize: When false, this job will wait in a PENDING state
>>> after it has
>>> +#                 finished its work, waiting for @block-job-finalize
>>> before
>>> +#                 making any block graph changes.
>>
>> sounds a bit strange in context of bitmap-population job
>>
> 
> Yeah, you're right. Copy-pasted for "consistency".
> 
>>> +#                 When true, this job will automatically
>>> +#                 perform its abort or commit actions.
>>> +#                 Defaults to true.
>>> +#
>>> +# @auto-dismiss: When false, this job will wait in a CONCLUDED state
>>> after it
>>> +#                has completely ceased all work, and awaits
>>> @block-job-dismiss.
>>> +#                When true, this job will automatically disappear
>>> from the query
>>> +#                list without user intervention.
>>> +#                Defaults to true.
>>> +#
>>> +# Since: 5.0
>>> +##
>>> +{ 'struct': 'BlockDirtyBitmapPopulate',
>>> +  'base': 'BlockDirtyBitmap',
>>> +  'data': { 'job-id': 'str',
>>> +            'pattern': 'BitmapPattern',
>>> +            '*on-error': 'BlockdevOnError',
>>> +            '*auto-finalize': 'bool',
>>> +            '*auto-dismiss': 'bool' } }
>>> +
>>>    ##
>>>    # @BlockDirtyBitmapSha256:
>>>    #
>>> diff --git a/qapi/job.json b/qapi/job.json
>>> index 5e658281f5..5f496d4630 100644
>>> --- a/qapi/job.json
>>> +++ b/qapi/job.json
>>> @@ -22,7 +22,7 @@
>>>    # Since: 1.7
>>>    ##
>>>    { 'enum': 'JobType',
>>> -  'data': ['commit', 'stream', 'mirror', 'backup', 'create'] }
>>> +  'data': ['commit', 'stream', 'mirror', 'backup', 'create',
>>> 'bitmap-populate'] }
>>>      ##
>>>    # @JobStatus:
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 6f9fd5e20e..a5884b597e 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -1215,6 +1215,27 @@ BlockJob *backup_job_create(const char *job_id,
>>> BlockDriverState *bs,
>>>                                BlockCompletionFunc *cb, void *opaque,
>>>                                JobTxn *txn, Error **errp);
>>>    +/*
>>> + * bitpop_job_create: Create a new bitmap population job.
>>> + *
>>> + * @job_id: The id of the newly-created job.
>>> + * @bs: Block device associated with the @target_bitmap.
>>> + * @target_bitmap: The bitmap to populate.
>>> + * @on_error: What to do if an error on @bs is encountered.
>>> + * @creation_flags: Flags that control the behavior of the Job lifetime.
>>> + *                  See @BlockJobCreateFlags
>>> + * @cb: Completion function for the job.
>>> + * @opaque: Opaque pointer value passed to @cb.
>>> + * @txn: Transaction that this job is part of (may be NULL).
>>> + */
>>> +BlockJob *bitpop_job_create(const char *job_id, BlockDriverState *bs,
>>> +                            BdrvDirtyBitmap *target_bitmap,
>>> +                            BitmapPattern pattern,
>>> +                            BlockdevOnError on_error,
>>> +                            int creation_flags,
>>> +                            BlockCompletionFunc *cb, void *opaque,
>>> +                            JobTxn *txn, Error **errp);
>>> +
>>>    void hmp_drive_add_node(Monitor *mon, const char *optstr);
>>>      BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>>> diff --git a/block/bitmap-alloc.c b/block/bitmap-alloc.c
>>> new file mode 100644
>>> index 0000000000..47d542dc12
>>> --- /dev/null
>>> +++ b/block/bitmap-alloc.c
>>> @@ -0,0 +1,207 @@
>>> +/*
>>> + * Async Dirty Bitmap Populator
>>> + *
>>> + * Copyright (C) 2020 Red Hat, Inc.
>>> + *
>>> + * Authors:
>>> + *  John Snow <jsnow@redhat.com>
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>> later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +
>>> +#include "trace.h"
>>> +#include "block/block.h"
>>> +#include "block/block_int.h"
>>> +#include "block/blockjob_int.h"
>>> +#include "block/block_backup.h"
>>> +#include "block/block-copy.h"
>>
>> I hope, not all includes are needed :)
> 
> Whoops, no, of course not. I copied the skeleton from backup, as you can
> tell ;)
> 
>>
>>> +#include "qapi/error.h"
>>> +#include "qapi/qmp/qerror.h"
>>> +#include "qemu/ratelimit.h"
>>> +#include "qemu/cutils.h"
>>> +#include "sysemu/block-backend.h"
>>> +#include "qemu/bitmap.h"
>>> +#include "qemu/error-report.h"
>>> +
>>> +typedef struct BitpopBlockJob {
>>> +    BlockJob common;
>>> +    BlockDriverState *bs;
>>> +    BdrvDirtyBitmap *target_bitmap;
>>> +    BdrvDirtyBitmap *new_bitmap;
>>> +    BlockdevOnError on_error;
>>> +    uint64_t len;
>>> +} BitpopBlockJob;
>>> +
>>> +static const BlockJobDriver bitpop_job_driver;
>>> +
>>> +static void bitpop_commit(Job *job)
>>> +{
>>> +    BitpopBlockJob *s = container_of(job, BitpopBlockJob, common.job);
>>> +
>>> +    bdrv_dirty_bitmap_merge_internal(s->target_bitmap, s->new_bitmap,
>>> +                                     NULL, true);
>>
>> Hmm, so you populate new_bitmap, and then merge to target. Why can't we
>> work
>> directly with target bitmap? The most probable case is that libvirt will
>> create bitmap specifically to use as target in this operation, or not?
>>
> 
> Most likely case, yes. Odds are very good it will be a brand new bitmap.
> 
> However, we already have a creation command -- I didn't want to make a
> second job-version of the command and then maintain two interfaces, so I
> made it a "merge into existing" style command instead.
> 
>> Hmm, just to make it possible to cancel the job and keep the target
>> bitmap in
>> original state? Is it really needed? I think on failure target bitmap
>> will be
>> removed anyway..
>>
> 
> You caught me being *lazy*. I copy the bitmap so I can unconditionally
> enable it to catch in-flight writes without having to create block graph
> modifications.
> 
> But, yes, to undo changes if we cancel.
> 
> I didn't want to make a job that was not able to be canceled. The
> alternative is pursuing the design where we allow new bitmaps only --
> because then on cancel we can just delete them.

On backup job (and any other) we can't rollback target changes on cancel.
So, I think it would be OK to take same semantics for the new job, keeping in
mind that it would be most probable usage case and no sense in creating
additional bitmaps. And if caller needs to use existent non-empty bitmap as
target and wants correct cancel, it always can create additional bitmap by
itself and then merge it to actual target.

And why new? It's up to user, what to use as target. And user knows, that on
failure or cancel target becomes invalid and will workaround this if needed.

> 
>>> +}
>>> +
>>> +/* no abort needed; just clean without committing. */
>>> +
>>> +static void bitpop_clean(Job *job)
>>> +{
>>> +    BitpopBlockJob *s = container_of(job, BitpopBlockJob, common.job);
>>> +
>>> +    bdrv_release_dirty_bitmap(s->new_bitmap);
>>> +    bdrv_dirty_bitmap_set_busy(s->target_bitmap, false);
>>> +}
>>> +
>>> +static BlockErrorAction bitpop_error_action(BitpopBlockJob *job, int
>>> error)
>>> +{
>>> +    return block_job_error_action(&job->common, job->on_error, true,
>>> error);
>>> +}
>>> +
>>> +static bool coroutine_fn yield_and_check(Job *job)
>>> +{
>>> +    if (job_is_cancelled(job)) {
>>> +        return true;
>>> +    }
>>> +
>>> +    job_sleep_ns(job, 0);
>>> +
>>> +    if (job_is_cancelled(job)) {
>>> +        return true;
>>> +    }
>>> +
>>> +    return false;
>>> +}
>>> +
>>> +static int coroutine_fn bitpop_run(Job *job, Error **errp)
>>> +{
>>> +    BitpopBlockJob *s = container_of(job, BitpopBlockJob, common.job);
>>> +    int ret = 0;
>>> +    int64_t offset;
>>> +    int64_t count;
>>> +    int64_t bytes;
>>> +
>>> +    for (offset = 0; offset < s->len; ) {
>>> +        if (yield_and_check(job)) {
>>> +            ret = -ECANCELED;
>>> +            break;
>>> +        }
>>> +
>>> +        bytes = s->len - offset;
>>> +        ret = bdrv_is_allocated(s->bs, offset, bytes, &count);
>>> +        if (ret < 0) {
>>> +            if (bitpop_error_action(s, -ret) ==
>>> BLOCK_ERROR_ACTION_REPORT) {
>>> +                break;
>>> +            }
>>> +            continue;
>>> +        }
>>> +
>>> +        if (!count) {
>>> +            ret = 0;
>>
>> Hmm, I think it's impossible case.. If so, better to make an assertion
>> or ignore..
>>
> 
> OK, agreed.
> 
>>> +            break;
>>> +        }
>>> +
>>> +        if (ret) {
>>> +            bdrv_set_dirty_bitmap(s->new_bitmap, offset, count);
>>> +            ret = 0;
>>> +        }
>>> +
>>> +        job_progress_update(job, count);
>>> +        offset += count;
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static const BlockJobDriver bitpop_job_driver = {
>>> +    .job_driver = {
>>> +        .instance_size          = sizeof(BitpopBlockJob),
>>> +        .job_type               = JOB_TYPE_BITMAP_POPULATE,
>>> +        .free                   = block_job_free,
>>> +        .user_resume            = block_job_user_resume,
>>> +        .run                    = bitpop_run,
>>> +        .commit                 = bitpop_commit,
>>> +        .clean                  = bitpop_clean,
>>> +    }
>>> +};
>>> +
>>> +
>>> +BlockJob *bitpop_job_create(
>>> +    const char *job_id,
>>> +    BlockDriverState *bs,
>>> +    BdrvDirtyBitmap *target_bitmap,
>>> +    BitmapPattern pattern,
>>> +    BlockdevOnError on_error,
>>> +    int creation_flags,
>>> +    BlockCompletionFunc *cb,
>>> +    void *opaque,
>>> +    JobTxn *txn,
>>> +    Error **errp)
>>> +{
>>> +    int64_t len;
>>> +    BitpopBlockJob *job = NULL;
>>> +    int64_t cluster_size;
>>> +    BdrvDirtyBitmap *new_bitmap = NULL;
>>> +
>>> +    assert(bs);
>>> +    assert(target_bitmap);
>>> +
>>> +    if (!bdrv_is_inserted(bs)) {
>>> +        error_setg(errp, "Device is not inserted: %s",
>>> +                   bdrv_get_device_name(bs));
>>> +        return NULL;
>>> +    }
>>
>> Why this?
>>
> 
> I assumed there was nothing to read the allocation map *of* if there
> wasn't a media present.
> 
> Am I mistaken?

is_inserted checks existing of bs->drv, but bitmap operations actually
doesn't need any drv.. Hmm. I'm not against this check anyway.

> 
>>> +
>>> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
>>> +        return NULL;
>>> +    }
>>
>> and this?
>>
> 
> Copy-paste: I don't understand if I want a new op blocker, to re-use an
> op-blocker, or to have no op blocker.
> 
> Genuinely I have no idea. I should left a review comment here, I forgot
> about this part, sorry.

I'm for no op blocker. As I understand, op-blockers are old and should be
replaced by permissions and frozen children. So, if we don't know, do we
need any blocking here, better go on without it. Also, op-blockers can't
block forbidden usage through filters anyway.

> 
>>> +
>>> +    if (bdrv_dirty_bitmap_check(target_bitmap, BDRV_BITMAP_DEFAULT,
>>> errp)) {
>>> +        return NULL;
>>> +    }
>>> +
>>> +    if (pattern != BITMAP_PATTERN_ALLOCATION_TOP) {
>>> +        error_setg(errp, "Unrecognized bitmap pattern");
>>> +        return NULL;
>>> +    }
>>> +
>>> +    len = bdrv_getlength(bs);
>>> +    if (len < 0) {
>>> +        error_setg_errno(errp, -len, "unable to get length for '%s'",
>>> +                         bdrv_get_device_name(bs));
>>> +        return NULL;
>>> +    }
>>> +
>>> +    /* NB: new bitmap is anonymous and enabled */
>>> +    cluster_size = bdrv_dirty_bitmap_granularity(target_bitmap);
>>> +    new_bitmap = bdrv_create_dirty_bitmap(bs, cluster_size, NULL, errp);
>>> +    if (!new_bitmap) {
>>> +        return NULL;
>>> +    }
>>> +
>>> +    /* Take ownership; we reserve the right to write into this
>>> on-commit. */
>>> +    bdrv_dirty_bitmap_set_busy(target_bitmap, true);
>>
>> Honestly, I still have bad understanding about how should we use dirty
>> bitmap mutex,
>> but note that bdrv_dirty_bitmap_set_busy locks the mutex. And it is (may
>> be) possible,
>> that busy status of the bitmap is changed after bdrv_dirty_bitmap_check
>> but before
>> bdrv_dirty_bitmap_set_busy.  So, more correct would be do both operation
>> under one
>> critical section. Still, I don't know is the situation possible.
>>
> 
> Aren't we under the BQL here? Can we be pre-empted? :(

Seems we are. But, as it's said above dirty_bitmap_mutex declaration:

     /* Writing to the list requires the BQL _and_ the dirty_bitmap_mutex.
      * Reading from the list can be done with either the BQL or the
      * dirty_bitmap_mutex.  Modifying a bitmap only requires
      * dirty_bitmap_mutex.  */

It means, that another thread may modify bitmap (for example its 'busy' field)
taking only dirty_bitmap_mutex, which will lead to the case I described.

> 
>>> +
>>> +    job = block_job_create(job_id, &bitpop_job_driver, txn, bs,
>>> +                           BLK_PERM_CONSISTENT_READ,
>>
>> Do we need it? We are not going to read..
>>
> 
> Copy-paste / leftover from an earlier draft where I was trying to
> achieve atomicity. It can be removed if we don't want the stricter
> atomicity.
> 
>>> +                           BLK_PERM_ALL & ~BLK_PERM_RESIZE,
>>> +                           0, creation_flags,
>>> +                           cb, opaque, errp);
>>> +    if (!job) {
>>> +        bdrv_dirty_bitmap_set_busy(target_bitmap, false);
>>> +        bdrv_release_dirty_bitmap(new_bitmap);
>>> +        return NULL;
>>> +    }
>>> +
>>> +    job->bs = bs;
>>> +    job->on_error = on_error;
>>> +    job->target_bitmap = target_bitmap;
>>> +    job->new_bitmap = new_bitmap;
>>> +    job->len = len;
>>> +    job_progress_set_remaining(&job->common.job, job->len);
>>> +
>>> +    return &job->common;
>>> +}
>>> diff --git a/blockjob.c b/blockjob.c
>>> index 5d63b1e89d..7e450372bd 100644
>>> --- a/blockjob.c
>>> +++ b/blockjob.c
>>> @@ -56,7 +56,8 @@ static bool is_block_job(Job *job)
>>>        return job_type(job) == JOB_TYPE_BACKUP ||
>>>               job_type(job) == JOB_TYPE_COMMIT ||
>>>               job_type(job) == JOB_TYPE_MIRROR ||
>>> -           job_type(job) == JOB_TYPE_STREAM;
>>> +           job_type(job) == JOB_TYPE_STREAM ||
>>> +           job_type(job) == JOB_TYPE_BITMAP_POPULATE;
>>>    }
>>>      BlockJob *block_job_next(BlockJob *bjob)
>>> diff --git a/block/Makefile.objs b/block/Makefile.objs
>>> index 3bcb35c81d..f3cfc89d90 100644
>>> --- a/block/Makefile.objs
>>> +++ b/block/Makefile.objs
>>> @@ -36,6 +36,7 @@ block-obj-$(CONFIG_LIBSSH) += ssh.o
>>>    block-obj-y += accounting.o dirty-bitmap.o
>>>    block-obj-y += write-threshold.o
>>>    block-obj-y += backup.o
>>> +block-obj-y += bitmap-alloc.o
>>>    block-obj-$(CONFIG_REPLICATION) += replication.o
>>>    block-obj-y += throttle.o copy-on-read.o
>>>    block-obj-y += block-copy.o
>>>
>>
>>
> 
> Thanks for the review. I'll start making changes, but won't send V2 just
> yet.
> 
> --js
> 


-- 
Best regards,
Vladimir


  reply	other threads:[~2020-02-26  5:08 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-25  0:56 [PATCH 0/6] block: add block-dirty-bitmap-populate job John Snow
2020-02-25  0:56 ` [PATCH 1/6] block: add bitmap-populate job John Snow
2020-02-25 16:04   ` Vladimir Sementsov-Ogievskiy
2020-02-25 20:41     ` John Snow
2020-02-26  5:07       ` Vladimir Sementsov-Ogievskiy [this message]
2020-02-26 19:11         ` John Snow
2020-02-27  6:11           ` Vladimir Sementsov-Ogievskiy
2020-03-03 21:39             ` John Snow
2020-02-25  0:56 ` [PATCH 2/6] qmp: expose block-dirty-bitmap-populate John Snow
2020-02-27 10:44   ` Vladimir Sementsov-Ogievskiy
2020-03-03 21:48     ` John Snow
2020-02-25  0:56 ` [PATCH 3/6] iotests: move bitmap helpers into their own file John Snow
2020-02-27 10:54   ` Vladimir Sementsov-Ogievskiy
2020-03-03 21:55     ` John Snow
2020-02-25  0:56 ` [PATCH 4/6] iotests: add hmp helper with logging John Snow
2020-02-27 10:57   ` Vladimir Sementsov-Ogievskiy
2020-02-25  0:56 ` [PATCH 5/6] qmp.py: change event_wait to use a dict John Snow
2020-02-27 11:25   ` Vladimir Sementsov-Ogievskiy
2020-03-03 21:58     ` John Snow
2020-02-25  0:56 ` [PATCH 6/6] iotests: add 287 for block-dirty-bitmap-populate John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d462c95f-a616-1ddb-f0fd-650be113c384@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=crosa@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pkrempa@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.