All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: armbru@redhat.com, pbonzini@redhat.com, mreitz@redhat.com,
	kwolf@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export
Date: Wed, 20 Jun 2018 06:24:08 -0500	[thread overview]
Message-ID: <4ab9dd4e-1c79-fbe2-8e4b-6563e3775366@redhat.com> (raw)
In-Reply-To: <20180609151758.17343-5-vsementsov@virtuozzo.com>

On 06/09/2018 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:
> Handle new NBD meta namespace: "qemu", and corresponding queries:

s/new/a new/

> "qemu:dirty-bitmap:<export bitmap name>".
> 
> With new metadata context negotiated, BLOCK_STATUS query will reply

s/new/the new/

> with dirty-bitmap data, converted to extents. New public function

s/New/The new/

> nbd_export_bitmap selects bitmap to export. For now, only one bitmap

s/bitmap/which bitmap/

> may be exported.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/nbd.h |   6 ++
>   nbd/server.c        | 271 ++++++++++++++++++++++++++++++++++++++++++++++++----
>   2 files changed, 259 insertions(+), 18 deletions(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index fcdcd54502..a653d0fba7 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -234,6 +234,10 @@ enum {
>   #define NBD_STATE_HOLE (1 << 0)
>   #define NBD_STATE_ZERO (1 << 1)
>   
> +/* Flags for extents (NBDExtent.flags) of NBD_REPLY_TYPE_BLOCK_STATUS,
> + * for qemu:dirty-bitmap:* meta contexts */

Comment tail.

> +#define NBD_STATE_DIRTY (1 << 0)
> +
>   static inline bool nbd_reply_type_is_error(int type)
>   {
>       return type & (1 << 15);
> @@ -315,6 +319,8 @@ void nbd_client_put(NBDClient *client);
>   void nbd_server_start(SocketAddress *addr, const char *tls_creds,
>                         Error **errp);
>   
> +void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
> +                       const char *bitmap_export_name, Error **errp);
>   
>   /* nbd_read
>    * Reads @size bytes from @ioc. Returns 0 on success.
> diff --git a/nbd/server.c b/nbd/server.c
> index 2d762d7289..569e068fc2 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -23,6 +23,12 @@
>   #include "nbd-internal.h"
>   
>   #define NBD_META_ID_BASE_ALLOCATION 0
> +#define NBD_META_ID_DIRTY_BITMAP 1
> +
> +/* NBD_MAX_BITMAP_EXTENTS: 1 mb of extents data. An empirical constant. If need
> + * to increase, note that NBD protocol defines 32 mb as a limit, after which the

s/need to increase/an increase is needed/

> + * reply may be considered as a denial of service attack. */
> +#define NBD_MAX_BITMAP_EXTENTS (0x100000 / 8)
>   
>   static int system_errno_to_nbd_errno(int err)
>   {
> @@ -80,6 +86,9 @@ struct NBDExport {
>   
>       BlockBackend *eject_notifier_blk;
>       Notifier eject_notifier;
> +
> +    BdrvDirtyBitmap *export_bitmap;
> +    char *export_bitmap_context;
>   };
>   
>   static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
> @@ -92,6 +101,7 @@ typedef struct NBDExportMetaContexts {
>       bool valid; /* means that negotiation of the option finished without
>                      errors */
>       bool base_allocation; /* export base:allocation context (block status) */
> +    bool dirty_bitmap; /* export qemu:dirty-bitmap:<export bitmap name> */
>   } NBDExportMetaContexts;
>   
>   struct NBDClient {
> @@ -810,6 +820,55 @@ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
>                                        &meta->base_allocation, errp);
>   }
>   
> +/* nbd_meta_bitmap_query
> + *
> + * Handle query to 'qemu:' namespace.
> + * @len is the amount of text remaining to be read from the current name, after
> + * the 'qemu:' portion has been stripped.
> + *
> + * Return -errno on I/O error, 0 if option was completely handled by
> + * sending a reply about inconsistent lengths, or 1 on success. */
> +static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                               uint32_t len, Error **errp)
> +{
> +    bool dirty_bitmap = false;
> +    int dirty_bitmap_len = strlen("dirty-bitmap:");

size_t is better for strlen() values.

> +    int ret;
> +
> +    if (!meta->exp->export_bitmap) {
> +        return nbd_opt_skip(client, len, errp);
> +    }

Worth a trace?...

> +
> +    if (len == 0) {
> +        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            meta->dirty_bitmap = true;
> +        }
> +        trace_nbd_negotiate_meta_query_parse("empty");
> +        return 1;
> +    }
> +
> +    if (len < dirty_bitmap_len) {
> +        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> +        return nbd_opt_skip(client, len, errp);
> +    }
> +

...especially since other returns have traced the result.

I'm strongly thinking of adding a patch to QMP to add an x-context 
parameter when creating an NBD client, in order to at least make testing 
client/server interactions easier than just code inspection.  Does not 
have to be part of this series.

> +    len -= dirty_bitmap_len;
> +    ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +    if (!dirty_bitmap) {
> +        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> +        return nbd_opt_skip(client, len, errp);
> +    }
> +
> +    trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
> +
> +    return nbd_meta_empty_or_pattern(
> +            client, meta->exp->export_bitmap_context +
> +            strlen("qemu:dirty_bitmap:"), len, &meta->dirty_bitmap, errp);
> +}
> +
>   /* nbd_negotiate_meta_query
>    *
>    * Parse namespace name and call corresponding function to parse body of the
> @@ -826,8 +885,10 @@ static int nbd_negotiate_meta_query(NBDClient *client,
>                                       NBDExportMetaContexts *meta, Error **errp)
>   {
>       int ret;
> -    char query[sizeof("base:") - 1];
> -    size_t baselen = strlen("base:");
> +    size_t ns_len = 5;
> +    char ns[5]; /* Both 'qemu' and 'base' namespaces have length = 5 including a
> +                   colon. If it's needed to introduce a namespace of the other
> +                   length, this should be certainly refactored. */

s/be certainly/certainly be/

...

> @@ -1793,6 +1871,9 @@ static int blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,
>   }
>   
>   /* nbd_co_send_extents
> + *
> + * NBD_REPLY_FLAG_DONE is not set, don't forget to send it.
> + *
>    * @extents should be in big-endian */
>   static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
>                                  NBDExtent *extents, unsigned nb_extents,
> @@ -1805,7 +1886,7 @@ static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
>           {.iov_base = extents, .iov_len = nb_extents * sizeof(extents[0])}
>       };
>   
> -    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_BLOCK_STATUS,
> +    set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_BLOCK_STATUS,

Worth a bool parameter to send the flag automatically on the last context?

>                    handle, sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
>       stl_be_p(&chunk.context_id, context_id);
>   
> @@ -1830,6 +1911,97 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
>       return nbd_co_send_extents(client, handle, &extent, 1, context_id, errp);
>   }
>   
> +/* Set several extents, describing region of given @length with given @flags.
> + * Do not set more than @nb_extents, return number of set extents.
> + */
> +static unsigned add_extents(NBDExtent *extents, unsigned nb_extents,
> +                            uint64_t length, uint32_t flags)
> +{
> +    unsigned i = 0;
> +    uint32_t max_extent = QEMU_ALIGN_DOWN(INT32_MAX, BDRV_SECTOR_SIZE);

This is too small of a granularity wrong when the server advertised 4k 
alignment during NBD_OPT_GO; it should probably refer to 
bs->bl.request_alignment.

> +    uint32_t max_extent_be = cpu_to_be32(max_extent);
> +    uint32_t flags_be = cpu_to_be32(flags);
> +
> +    for (i = 0; i < nb_extents && length > max_extent;
> +         i++, length -= max_extent)
> +    {
> +        extents[i].length = max_extent_be;
> +        extents[i].flags = flags_be;
> +    }
> +
> +    if (length > 0 && i < nb_extents) {
> +        extents[i].length = cpu_to_be32(length);
> +        extents[i].flags = flags_be;
> +        i++;
> +    }
> +
> +    return i;
> +}
> +
> +static unsigned bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
> +                                  uint64_t length, NBDExtent *extents,
> +                                  unsigned nb_extents, bool dont_fragment)
> +{
> +    uint64_t begin = offset, end;
> +    uint64_t overall_end = offset + length;
> +    unsigned i = 0;
> +    BdrvDirtyBitmapIter *it;
> +    bool dirty;
> +
> +    bdrv_dirty_bitmap_lock(bitmap);
> +
> +    it = bdrv_dirty_iter_new(bitmap);
> +    dirty = bdrv_get_dirty_locked(NULL, bitmap, offset);
> +
> +    while (begin < overall_end && i < nb_extents) {
> +        if (dirty) {
> +            end = bdrv_dirty_bitmap_next_zero(bitmap, begin);
> +        } else {
> +            bdrv_set_dirty_iter(it, begin);
> +            end = bdrv_dirty_iter_next(it);
> +        }
> +        if (end == -1) {
> +            end = bdrv_dirty_bitmap_size(bitmap);
> +        }
> +        if (dont_fragment && end > overall_end) {
> +            /* We MUST not exceed requested region if NBD_CMD_FLAG_REQ_ONE flag
> +             * is set */
> +            end = overall_end;
> +        }
> +
> +        i += add_extents(extents + i, nb_extents - i, end - begin,
> +                         dirty ? NBD_STATE_DIRTY : 0);
> +        begin = end;
> +        dirty = !dirty;
> +    }
> +
> +    bdrv_dirty_iter_free(it);
> +
> +    bdrv_dirty_bitmap_unlock(bitmap);
> +
> +    return i;
> +}
> +
> +static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
> +                              BdrvDirtyBitmap *bitmap, uint64_t offset,
> +                              uint64_t length, bool dont_fragment,
> +                              uint32_t context_id, Error **errp)
> +{
> +    int ret;
> +    unsigned nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS;
> +    NBDExtent *extents = g_new(NBDExtent, nb_extents);
> +
> +    nb_extents = bitmap_to_extents(bitmap, offset, length, extents, nb_extents,
> +                                   dont_fragment);
> +
> +    ret = nbd_co_send_extents(client, handle, extents, nb_extents, context_id,
> +                              errp);

Worth a trace when extents are sent?

> +
> +    g_free(extents);
> +
> +    return ret;
> +}
> +
>   /* nbd_co_receive_request
>    * Collect a client request. Return 0 if request looks valid, -EIO to drop
>    * connection right away, and any other negative value to report an error to
> @@ -2043,11 +2215,33 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>                                         "discard failed", errp);
>   
>       case NBD_CMD_BLOCK_STATUS:
> -        if (client->export_meta.valid && client->export_meta.base_allocation) {
> -            return nbd_co_send_block_status(client, request->handle,
> -                                            blk_bs(exp->blk), request->from,
> -                                            request->len,
> -                                            NBD_META_ID_BASE_ALLOCATION, errp);
> +        if (client->export_meta.valid &&
> +            (client->export_meta.base_allocation ||
> +             client->export_meta.dirty_bitmap))
> +        {
> +            if (client->export_meta.base_allocation) {
> +                ret = nbd_co_send_block_status(client, request->handle,
> +                                               blk_bs(exp->blk), request->from,
> +                                               request->len,
> +                                               NBD_META_ID_BASE_ALLOCATION,
> +                                               errp);
> +                if (ret < 0) {
> +                    return ret;
> +                }
> +            }
> +
> +            if (client->export_meta.dirty_bitmap) {
> +                ret = nbd_co_send_bitmap(client, request->handle,
> +                                         client->exp->export_bitmap,
> +                                         request->from, request->len,
> +                                         request->flags & NBD_CMD_FLAG_REQ_ONE,
> +                                         NBD_META_ID_DIRTY_BITMAP, errp);
> +                if (ret < 0) {
> +                    return ret;
> +                }
> +            }
> +
> +            return nbd_co_send_structured_done(client, request->handle, errp);
>           } else {
>               return nbd_send_generic_reply(client, request->handle, -EINVAL,
>                                             "CMD_BLOCK_STATUS not negotiated",
> @@ -2199,3 +2393,44 @@ void nbd_client_new(NBDExport *exp,
>       co = qemu_coroutine_create(nbd_co_client_start, client);
>       qemu_coroutine_enter(co);
>   }
> +
> +void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
> +                       const char *bitmap_export_name, Error **errp)
> +{
> +    BdrvDirtyBitmap *bm = NULL;
> +    BlockDriverState *bs = blk_bs(exp->blk);
> +
> +    if (exp->export_bitmap) {
> +        error_setg(errp, "Export bitmap is already set");
> +        return;
> +    }
> +
> +    while (true) {
> +        bm = bdrv_find_dirty_bitmap(bs, bitmap);
> +        if (bm != NULL || bs->backing == NULL) {
> +            break;
> +        }
> +
> +        bs = bs->backing->bs;
> +    }

Is searching for the dirty bitmap in the backing chain always the wisest 
thing to do?  I guess it works, but it seems a bit odd on first glance.

> +
> +    if (bm == NULL) {
> +        error_setg(errp, "Bitmap '%s' is not found", bitmap);
> +        return;
> +    }
> +
> +    if (bdrv_dirty_bitmap_enabled(bm)) {
> +        error_setg(errp, "Bitmap '%s' is enabled", bitmap);
> +        return;
> +    }
> +
> +    if (bdrv_dirty_bitmap_qmp_locked(bm)) {
> +        error_setg(errp, "Bitmap '%s' is locked", bitmap);
> +        return;
> +    }
> +
> +    bdrv_dirty_bitmap_set_qmp_locked(bm, true);
> +    exp->export_bitmap = bm;
> +    exp->export_bitmap_context =
> +            g_strdup_printf("qemu:dirty-bitmap:%s", bitmap_export_name);
> +}
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

  reply	other threads:[~2018-06-20 11:24 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-09 15:17 [Qemu-devel] [PATCH v5 0/6] NBD export bitmaps Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 1/6] nbd/server: fix trace Vladimir Sementsov-Ogievskiy
2018-06-19 18:39   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 2/6] nbd/server: refactor NBDExportMetaContexts Vladimir Sementsov-Ogievskiy
2018-06-19 19:03   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 3/6] nbd/server: add nbd_meta_empty_or_pattern helper Vladimir Sementsov-Ogievskiy
2018-06-19 20:24   ` Eric Blake
2018-06-20  9:43     ` Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export Vladimir Sementsov-Ogievskiy
2018-06-20 11:24   ` Eric Blake [this message]
2018-06-20 14:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 15:43     ` Eric Blake
2018-06-20 15:58       ` Eric Blake
2018-06-20 16:27   ` Eric Blake
2018-06-20 17:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:09       ` Eric Blake
2018-06-21 10:09         ` Vladimir Sementsov-Ogievskiy
2018-09-14 16:22         ` Vladimir Sementsov-Ogievskiy
2018-11-29  4:34   ` Eric Blake
2019-01-09 19:21   ` Eric Blake
2019-01-10  7:15     ` Eric Blake
2019-01-17 21:09     ` John Snow
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 5/6] qapi: new qmp command nbd-server-add-bitmap Vladimir Sementsov-Ogievskiy
2018-06-20 11:26   ` Eric Blake
2018-06-20 14:13     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:14       ` Eric Blake
2018-06-21 10:10         ` Vladimir Sementsov-Ogievskiy
2018-06-21 10:23       ` Nikolay Shirokovskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 6/6] docs/interop: add nbd.txt Vladimir Sementsov-Ogievskiy
2018-06-20 11:33   ` Eric Blake
2018-06-20 14:16     ` Vladimir Sementsov-Ogievskiy
2018-06-20 20:58       ` [Qemu-devel] [Qemu-block] " John Snow
2018-06-21 15:59         ` Vladimir Sementsov-Ogievskiy
2018-06-21 22:10           ` [Qemu-devel] Incremental Backup Status (Was: Re: [Qemu-block] [PATCH v5 6/6] docs/interop: add nbd.txt) John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ab9dd4e-1c79-fbe2-8e4b-6563e3775366@redhat.com \
    --to=eblake@redhat.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.